Rule Creation Without Filters


As described in the earlier articles Creating Rules by Filtering Graph Widgets and Other Methods of Rule Development, the preferred way to create a rule is to begin with a simple text or phrase search, wait for data to aggregate, and then narrow the data by filtering on one or more of the widgets available on the Live Feed Page. However, if you already know the name of the entity you want from experience, this article shows how to create rules without using the widgets.

For all types of entities, the general rule syntax is a tag, a colon ( : ), and the identifying features, with no whitespace between components. The only whitespace allowed is whitespace that belongs to the identifier itself, in which case the identifier is wrapped in single quotes. The two forms are <tag>:<identifiers> and <tag>:'<the identifiers>' . Specific examples follow.
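The general syntax can be sketched as a small helper. This is only an illustration: the function name build_rule is hypothetical and not part of the product, and the quoting rule it applies (single quotes only when the identifier contains whitespace) is taken from the description in this article.

```python
def build_rule(tag: str, identifier: str) -> str:
    """Join a tag and an identifier as <tag>:<identifier>.

    The identifier is wrapped in single quotes only when it contains
    whitespace, following the syntax described in this article.
    build_rule is a hypothetical helper name used for illustration.
    """
    if not tag or not identifier:
        raise ValueError("tag and identifier must both be non-empty")
    if any(ch.isspace() for ch in identifier):
        return f"{tag}:'{identifier}'"
    return f"{tag}:{identifier}"


print(build_rule("type", "gplus"))
print(build_rule("location", "UNITED STATES.VIRGINIA"))
```

The resulting strings are what you would type as a rule; the helper does not send anything to the system.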

Entity-- All extractions from documents fall under the category of entity. This includes people, events, usernames, geolocations, and more. The tag for this type of rule is entity, and its identifiers name the type of entity. An entity rule can contain more than one identifier, separated by periods, to narrow from an entity type to a sub-entity. For example, there is an entity corporation, and within this entity is the sub-entity Nike. Therefore, the syntax to search for documents that contain Nike is entity:CORPORATION.NIKE . However, you could also search on all corporations with the syntax entity:CORPORATION . As new entities are extracted and new KE is added to the system, the list of possible entities grows; however, a list of common possible entities is available here.
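As a rough sketch of how the period-separated identifiers fit together, the hypothetical helper below (entity_rule is an illustrative name, not a product function) joins an entity type and any sub-entities into one rule. It assumes the identifiers contain no whitespace and passes them through unchanged.

```python
def entity_rule(*identifiers: str) -> str:
    """Build an entity rule from an entity type and optional sub-entities.

    Identifiers are joined with periods, broadest first. entity_rule is a
    hypothetical helper; it assumes identifiers contain no whitespace.
    """
    if not identifiers or any(not part for part in identifiers):
        raise ValueError("at least one non-empty identifier is required")
    return "entity:" + ".".join(identifiers)


print(entity_rule("CORPORATION"))
print(entity_rule("CORPORATION", "NIKE"))
```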

Hashtag-- The hashtag type is an extension of the entity type. The extended syntax to search for a hashtag is entity:HASHTAG.#<HASHTAG_NAME> ; however, since the hashtag type was created, you can query for hashtags simply by using the tag hashtag. The identifiers for this type are the hashtags themselves, preceded by the pound sign. For example, there is currently a hashtag in use on Twitter that is #SYRIA. Therefore, to query for all documents containing this hashtag, you would write the syntax hashtag:#SYRIA . Since hashtags can be created by any user of Twitter, a complete list of hashtags is not available.

URL-- The URL type is also an extension of the entity type. The extended syntax to search for a URL is entity:URL.<url> ; however, since the URL type was created, you can query for URLs simply by using the tag url. The identifiers for this type are the URLs themselves. For example, the BBC's website URL is http://www.bbc.com/. Therefore, to query for all documents containing this URL, you would type the syntax url:http://www.bbc.com/ . Note that this is a query for mentions of the desired URL in the text of the document, not for the URL from which the document originated.

Geolocation-- The geolocation type is an extension of the entity type. The extended syntax to search for a geolocation is entity:GEO_LOCATION.<LOCATION> ; however, since the geolocation type was created, you can query for geolocations simply by using the tag location. The identifiers for this type are countries, followed by sub-locations within the selected country. For example, you could search for all documents identified as originating from Virginia by using the syntax location:'UNITED STATES.VIRGINIA' . In this example, the single quotes around the name are necessary because there is whitespace in the name of the country. If you instead wanted to search for all documents originating from Germany, you could use either location:'GERMANY' or location:GERMANY . The quotes may be present even if the name contains no whitespace, but they must be present if it does. A short example of the types of locations available for use can be found here.
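Because quotes are allowed even when a location has no whitespace, one simple approach is to always quote location identifiers. The sketch below illustrates that choice; location_rule is a hypothetical helper name, and the country and sub-location values are passed through exactly as given.

```python
def location_rule(country: str, *sublocations: str) -> str:
    """Build a location rule: country first, then any sub-locations.

    The identifier is always single-quoted, which the article says is
    allowed with or without whitespace. location_rule is a hypothetical
    helper used only for illustration.
    """
    parts = [country, *sublocations]
    if any(not part.strip() for part in parts):
        raise ValueError("country and sub-locations must be non-empty")
    return "location:'" + ".".join(parts) + "'"


print(location_rule("UNITED STATES", "VIRGINIA"))
print(location_rule("GERMANY"))
```

Always quoting avoids having to check each name for whitespace, at the cost of two extra characters in rules such as location:'GERMANY' .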

Username-- The username type is an extension of the entity type. The extended syntax to search for a username is entity:USERNAME.<@username> ; however, since the username type was created, you can query for usernames simply by using the tag username. The identifiers for this type are the usernames themselves. For example, the username of an Arabic news Twitter account is @almanarnews. To create a query for posts by this username or mentioning this username, you could write the syntax username:@almanarnews .
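The relationship between the four shorthand tags (hashtag, url, location, username) and their extended entity forms can be sketched as a lookup table. The function name expand_shorthand is hypothetical. The article does not describe how quoted identifiers appear in the extended form, so this sketch deliberately rejects them rather than guess.

```python
# Shorthand tag -> entity type used in the extended syntax, as listed above.
SHORTHAND_TO_ENTITY_TYPE = {
    "hashtag": "HASHTAG",
    "url": "URL",
    "location": "GEO_LOCATION",
    "username": "USERNAME",
}


def expand_shorthand(rule: str) -> str:
    """Rewrite a shorthand rule in its extended entity:<TYPE>.<identifier> form.

    Hypothetical helper for illustration only. Only the first colon is
    treated as the separator, so URLs such as http://... stay intact.
    """
    tag, sep, identifier = rule.partition(":")
    if not sep or not identifier:
        raise ValueError("rule must look like <tag>:<identifier>")
    if tag not in SHORTHAND_TO_ENTITY_TYPE:
        raise ValueError(f"not a shorthand tag: {tag!r}")
    if identifier.startswith("'"):
        # Quoting in the extended form is not covered by the article.
        raise ValueError("quoted identifiers are not handled in this sketch")
    return f"entity:{SHORTHAND_TO_ENTITY_TYPE[tag]}.{identifier}"


print(expand_shorthand("hashtag:#SYRIA"))
print(expand_shorthand("username:@almanarnews"))
print(expand_shorthand("url:http://www.bbc.com/"))
```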

Event-- The event type contains all noun-verb-noun triple extractions from the documents in the Social Media Command Center. The tag for this type is event, and its identifiers name the type of event. Like the entity type, an event rule can contain more than one identifier. For example, there is an event attack, and within this event is the sub-event bomb. Therefore, the syntax to search for documents that contain an attack event featuring a bomb is event:ATTACK.BOMB . However, you could also search on attacks in general with the syntax event:ATTACK . As new events are extracted and new KE is added to the system, the list of possible events grows; however, a list of common possible events is available here.
 
Category--  This type refers to the type of speech desired in the document.  The tag for this type is category. The identifiers of this type are types of speech, such as question, negotiation terms, and more.  For example, the identifier for question-type speech is Question.  Therefore, the correct syntax to narrow to all documents containing questions is category:question . More examples of identifiers for this type can be found here.

Type-- This type narrows data based on the form of social media from which the documents originate. The tag for this type of rule is type. The identifiers for the type tag are forms of social media. For example, the identifier for Google Plus is gplus. Therefore, the syntax to narrow all documents to those identified as being from Google Plus is type:gplus . A list of all possible type identifiers is available here.
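To tie the rule types together, the sketch below checks that a hand-written rule follows the tag, colon, identifier pattern and splits it into its two parts. It is an illustration under stated assumptions: parse_rule and KNOWN_TAGS are hypothetical names, the tag set contains only the tags named in this article (the system may accept others), and the identifiers themselves are not checked against any list.

```python
from typing import Tuple

# Tags named in this article; the real system may accept more.
KNOWN_TAGS = {
    "entity", "hashtag", "url", "location",
    "username", "event", "category", "type",
}


def parse_rule(rule: str) -> Tuple[str, str]:
    """Split a rule into (tag, identifier), removing surrounding single quotes.

    Raises ValueError when the rule does not follow <tag>:<identifier> or
    when an identifier containing whitespace is not single-quoted.
    """
    tag, sep, identifier = rule.strip().partition(":")
    if not sep:
        raise ValueError("missing colon between tag and identifier")
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    if len(identifier) >= 2 and identifier[0] == "'" and identifier[-1] == "'":
        identifier = identifier[1:-1]
    elif any(ch.isspace() for ch in identifier):
        raise ValueError("identifiers containing whitespace must be single-quoted")
    if not identifier:
        raise ValueError("identifier is empty")
    return tag, identifier


print(parse_rule("event:ATTACK.BOMB"))
print(parse_rule("location:'UNITED STATES.VIRGINIA'"))
```

The identifier is returned as a single string rather than split on periods, because URL identifiers such as http://www.bbc.com/ contain periods that are not hierarchy separators.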


 

Article ID: 20, Created On: 2/15/2012, Modified: 3/9/2012