
The For Data Extraction Problem Essay


All string attributes are now ranked by their score Score(Ai) in descending order, which helps process analysts quickly identify a set of attributes for data extraction.

3.3 Task identification

The process lexicon discovered from process documents in the previous step can be analyzed further to discover a list of tasks in the processes. However, extracting correct relations among the process components is a difficult problem. Task identification requires text mining to discover linguistic patterns for tasks, so a task is defined as a triple consisting of a resource, an action, and a data item, in the form “R-A-D.” A task must contain an action A; either R or D (but not both) may be left implicit. Discriminating meaningful (true) tasks from the others can be formalized as an intra-sentence relation extraction problem: for a sentence S consisting of resources, actions, and data items, the aim is to derive all potential task instances and identify the meaningful tasks among them using binary classification.

A procedure was designed for task identification. The following sentence is used as an example to illustrate its steps: “The traveler must submit a request for reimbursement to the department within 30 days upon completing the travel.”

Step 1: Task instance generation: Given the process document tagged with resources, actions, and data, first generate all
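
The task definition in Section 3.3 (a triple with a mandatory action, where the resource or the data item, but not both, may be absent) can be illustrated with a minimal Python sketch. It only enumerates candidate triples for one sentence; it is not the procedure described in the text, and it does not perform the binary classification. The names TaskCandidate and generate_candidates are hypothetical, and the tagging of the example sentence into resources, actions, and data items is an assumption made for illustration.

```python
from itertools import product
from typing import List, NamedTuple, Optional


class TaskCandidate(NamedTuple):
    """A potential task in the form R-A-D; R or D may be None, A may not."""
    resource: Optional[str]
    action: str
    data: Optional[str]


def generate_candidates(resources: List[str],
                        actions: List[str],
                        data_items: List[str]) -> List[TaskCandidate]:
    """Enumerate every R-A-D combination allowed by the task definition."""
    candidates = []
    # None stands for a resource or data item that is not explicitly expressed.
    for r, a, d in product([*resources, None], actions, [*data_items, None]):
        if r is None and d is None:
            continue  # the definition excludes triples missing both R and D
        candidates.append(TaskCandidate(r, a, d))
    return candidates


# Assumed tagging of the example sentence (hypothetical, for illustration only).
resources = ["traveler", "department"]
actions = ["submit", "completing"]
data_items = ["request for reimbursement"]

candidates = generate_candidates(resources, actions, data_items)
for c in candidates:
    print(c)
```

Under this assumed tagging, the list contains candidates such as (traveler, submit, request for reimbursement) alongside implausible ones such as (department, completing, request for reimbursement); separating the two groups is the job of the binary classifier mentioned in the text.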
