The tgwf schema is designed both to spare workflow authors from having to know the semantics of GridWorkflowDL or Petri nets, which are far more complex, and to account for some requirements specific to TextGrid workflows. A tgwf document is transformed automatically into GridWorkflowDL by an XSLT stylesheet (see below).
In the following, we reproduce the schema (it contains some inline documentation) and then show an example tgwf document.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns:tgwf="http://textgrid.info/namespaces/middleware/workflow" targetNamespace="http://textgrid.info/namespaces/middleware/workflow">
<xs:annotation>
<xs:documentation>
Defines a simplified Workflow document in TextGrid. A tgwf
document written by the user will be completed by the
TextGridLab Workflow component, then xsl-transformed into a
GridWorkflowDL document which can be processed by the GWES Workflow
Engine.
</xs:documentation>
</xs:annotation>
<xs:element name="tgwf">
<xs:complexType>
<xs:sequence>
<xs:element ref="tgwf:description"/>
<xs:element ref="tgwf:activities"/>
<xs:element ref="tgwf:datalinks"/>
<xs:element ref="tgwf:CRUD"/>
<xs:element ref="tgwf:batchinput"/>
<xs:element ref="tgwf:metadatatransformation"/>
<xs:element ref="tgwf:inputconstants"/>
</xs:sequence>
<xs:attribute name="version" use="required" type="xs:decimal" fixed="0.5"/>
</xs:complexType>
</xs:element>
<xs:element name="description" type="xs:string">
<xs:annotation>
<xs:documentation>
Description will not be processed and is solely for the
writer. The title of the workflow will be taken from the title
of the TextGridObject holding this tgwf document.
</xs:documentation>
</xs:annotation>
</xs:element>
<xs:element name="activities">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" minOccurs="0" ref="tgwf:service">
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="service">
<xs:annotation>
<xs:documentation>
The services proper that will process the _contents_ of the
TGOs. All data is transferred SOAP-inline, base64-encoded, so
the services will have to be compatible. CRUDread and
CRUDcreate for Grid access and StreamingEditor for metadata
transformation will be inserted automatically.
</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:attribute name="description" use="required"/>
<xs:attribute name="name" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
name for visualisation of workflow
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="operation" use="required" type="xs:anyURI">
<xs:annotation>
<xs:documentation>
the operation to be invoked from this wsdl
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="serviceID" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
this ID will be used throughout this tgwf document to
refer to this service
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="targetNamespace" use="required" type="xs:anyURI">
<xs:annotation>
<xs:documentation>
If the WSDL specifies a targetNamespace, its value can be
given here.
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="usetns" type="xs:boolean">
<xs:annotation>
<xs:documentation>
set to true to tell the Workflow Engine that the message
parameters should be qualified with the given
targetNamespace. Hint: set to true if the schema definition
part in the WSDL has elementFormDefault="qualified". If you
interact with a Web Service written in a
namespace-ignorant language (such as PHP, Python, Perl, or
Tcl), usetns should probably be false.
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="wsdlLocation" use="required" type="xs:anyURI"/>
</xs:complexType>
</xs:element>
<xs:element name="datalinks">
<xs:annotation>
<xs:documentation>
Determine how data flows from one service to another,
i.e. which output parameter in fromService yields the data and
which input parameter in toService will receive them. Use
crud/batchinput for fromServiceID/fromParam when the link
should lead to toServices that should receive the data as read
from the Grid. Similarly, the fromService that will serve the
final data must have a link to crud/batchoutput. Caution:
consistency checks are not performed yet, so an inconsistent
workflow might fail or loop.
</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" minOccurs="1" ref="tgwf:link"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="link">
<xs:complexType>
<xs:attribute name="linkID" use="required" type="xs:NCName"/>
<xs:attribute name="fromServiceID" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
the ServiceID as specified in the activities element for
the service that yields data
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="fromParam" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
the output parameter of the fromServiceID which serves the
data for this link
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="toServiceID" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
the ServiceID as specified in the activities element, of
the service that receives the data
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="toParam" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
the input parameter of the toServiceID which accepts
the data for this link
</xs:documentation>
</xs:annotation>
</xs:attribute>
</xs:complexType>
</xs:element>
<xs:element name="CRUD">
<xs:annotation>
<xs:documentation>
attribute values to be filled in automatically by the TextGridLab
</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:attribute name="instance" use="required" type="xs:string"/>
<xs:attribute name="logParameter" use="required" type="xs:string"/>
<xs:attribute name="sessionID" use="required" type="xs:string"/>
</xs:complexType>
</xs:element>
<xs:element name="batchinput">
<xs:annotation>
<xs:documentation>
URIs of the input TextGridObjects, to be filled in automatically
by the TextGridLab
</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element ref="tgwf:URI" maxOccurs="unbounded" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="URI" type="xs:anyURI"/>
<xs:element name="metadatatransformation">
<xs:annotation>
<xs:documentation>
This contains the XSL stylesheet for rule-based transformation
of the metadata, e.g. setting a new ProjectID, appending text
to the title, or adding an editor. Please consult an example
stylesheet for the current TextGridMetadata if you plan to
write a new one.
</xs:documentation>
</xs:annotation>
<xs:complexType mixed="true">
<xs:sequence>
<xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="inputconstants">
<xs:annotation>
<xs:documentation>
configuration parameters for the services used in this
workflow
</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" minOccurs="0" ref="tgwf:activity"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="activity">
<xs:complexType>
<xs:sequence>
<xs:element ref="tgwf:const" maxOccurs="unbounded" minOccurs="1" />
</xs:sequence>
<xs:attribute name="serviceID" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
the ServiceID as specified in the activities element
</xs:documentation>
</xs:annotation>
</xs:attribute>
</xs:complexType>
</xs:element>
<xs:element name="const">
<xs:complexType mixed="true">
<xs:sequence>
<xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
<xs:attribute name="name" use="required" type="xs:NCName">
<xs:annotation>
<xs:documentation>
the name of this input parameter
</xs:documentation>
</xs:annotation>
</xs:attribute>
<xs:attribute name="needsB64encoding" type="xs:boolean">
<xs:annotation>
<xs:documentation>
set to true if this parameter, as the content data, has
to be encoded in Base64 for the service
</xs:documentation>
</xs:annotation>
</xs:attribute>
</xs:complexType>
</xs:element>
</xs:schema>
The following example document defines a two-service pipe: TextGridObjects are sent to the TextGrid Tokenizer, then to the Lemmatizer, and finally the resulting TextGridObjects are created. See figure XXX for a graphical representation of this workflow in GridWorkflowDL.
<?xml version="1.0" encoding="UTF-8"?>
<tgwf:tgwf xmlns:tgwf="http://textgrid.info/namespaces/middleware/workflow" version="0.5">
<tgwf:description>
Lemmatizer Workflow with prepended Tokenizer
</tgwf:description>
<tgwf:activities>
<tgwf:service description="TextGrid Tokenizer"
name="Tokenizer"
operation="Tokenizer64"
serviceID="tok"
targetNamespace="http://namespaces.textgrid.de/"
wsdlLocation="http://ingrid.sub.uni-goettingen.de/Tokenizer.wsdl"/>
<tgwf:service operation="LemmatizerTEIBatch64"
wsdlLocation="http://ingrid.sub.uni-goettingen.de/lemmatizer_doc.wsdl"
name="Lemmatizer"
description="The TextGrid New German Lemmatizer"
serviceID="lem"
targetNamespace="http://namespaces.textgrid.de/"/>
</tgwf:activities>
<tgwf:datalinks>
<tgwf:link linkID="read" fromServiceID="crud" fromParam="batchinput"
toServiceID="tok" toParam="indata"/>
<tgwf:link linkID="Tok2Lem" fromServiceID="tok" fromParam="outdata"
toServiceID="lem" toParam="infile" />
<tgwf:link linkID="write" fromServiceID="lem" fromParam="outfile"
toServiceID="crud" toParam="batchoutput"/>
</tgwf:datalinks>
<tgwf:CRUD instance="inserted automatically"
sessionID="inserted automatically"
logParameter="inserted automatically"/>
<tgwf:batchinput/>
<tgwf:metadatatransformation>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> ... </xsl:transform>
</tgwf:metadatatransformation>
<tgwf:inputconstants>
<tgwf:activity serviceID="tok">
<tgwf:const name="config" needsB64encoding="true">
<TokenizerConfig>...</TokenizerConfig>
</tgwf:const>
</tgwf:activity>
<tgwf:activity serviceID="lem">...</tgwf:activity>
</tgwf:inputconstants>
</tgwf:tgwf>
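The metadatatransformation element in the example above holds only a placeholder. As a minimal sketch of what a rule-based metadata stylesheet could look like, the following XSLT 1.0 transform copies the metadata unchanged and appends a suffix to the title. The element name title (matched by local name, because the TextGridMetadata namespace is not shown here) and the suffix text are assumptions made purely for illustration; please consult an example stylesheet for the current TextGridMetadata before relying on any of these names.

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Identity template: anything without a more specific rule
       is copied to the output unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical rule: append a suffix to every element whose local
       name is "title". Both the element name and the suffix text are
       assumptions, not taken from the TextGridMetadata schema. -->
  <xsl:template match="*[local-name() = 'title']">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:text> (lemmatized)</xsl:text>
    </xsl:copy>
  </xsl:template>

</xsl:transform>
```

Placed inside tgwf:metadatatransformation, a stylesheet of this kind would take the place of the elided xsl:transform content. Because the identity template keeps every metadata field that has no dedicated rule, further rules (for instance setting a new ProjectID or adding an editor) could be added as separate templates without touching the rest.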