GET /search
===========
Request Parameters:

========================== ======= ========= ======================================================================
Parameter                  Type    Default   Description
========================== ======= ========= ======================================================================
q                          String            Lucene search string
sid                        String            TextGrid session ID from tgauth
target                     String  both      where to do full-text searches:
                                             one of "structure", "metadata" and "both"
order                      String  relevance key-value pair of direction, ascending (asc) or descending (desc),
                                             and metadata field, e.g. asc:title or desc:author
start                      String            result to start with: a number, or the ``next`` attribute of the
                                             previous search result. A number only works up to 10,000 hits.
limit                      Integer 20        number of entries to return
kwicWidth                  Integer 40        number of characters before and after a KWIC match
wordDistance               Integer -1        max distance between two words in a full-text query. Ignored if set
                                             to a number < 0; in that case all words must be contained in one
                                             document for a hit.
path                       Boolean false     whether the path of a found result (work->edition->aggregations)
                                             should be added to the hit
allProjects                Boolean false     whether all projects should be searched for public data;
                                             warning: this query may be slow if many results are found
sandbox                    Boolean false     show sandboxed (not yet finally published) data
filter                     String            add filter on query results, e.g. for faceting (TODO: Syntax)
facet                      String            get facets for query results
facetlimit                 Integer 10        number of results to return for each facet
========================== ======= ========= ======================================================================

Response:

List of TextGrid objects found, as XML using the TextGrid metadata schema.

Example request::

    curl -s "https://textgridlab.org/1.0/tgsearch-public/search?q=waldeinsamkeit"

Example response::

    [...]
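
The request parameters listed above map directly onto a URL query string. The following Python sketch assembles such a URL with the standard library only. The helper name ``build_search_url`` is hypothetical, and writing boolean values as ``true``/``false`` is an assumption based on the defaults shown in the parameter table.

```python
from urllib.parse import urlencode

# Base URL taken from the example request above.
BASE_URL = "https://textgridlab.org/1.0/tgsearch-public/search"


def build_search_url(query, **params):
    """Return a search URL for `query` plus optional request parameters."""
    pairs = [("q", query)]
    for name, value in params.items():
        if value is None:
            continue  # omit the parameter so the server default applies
        if isinstance(value, bool):
            # Assumed spelling of booleans, following the table defaults.
            value = "true" if value else "false"
        pairs.append((name, value))
    return BASE_URL + "?" + urlencode(pairs)


print(build_search_url("waldeinsamkeit", limit=5, order="asc:title", path=True))
```

``urlencode`` percent-encodes the colon in ``asc:title`` as ``%3A``; a server decodes this back to a colon when it parses the query string.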
Search syntax
-------------
The search query passed with the parameter "q" can be written in Lucene syntax, as described in lucene_syntax_.
Facets and filters
------------------
To facet the search result, facets can be generated on metadata fields.
These facets are generated on the whole set of objects matching the current search request.
The parameter ``facet`` is repeatable.

Example request (facets on format and agent)::

    curl -s "https://textgridlab.org/1.0/tgsearch-public/search?q=waldeinsamkeit&facet=format&facet=edition.agent.value"

Example response (facet values only)::

    [...]
    format:
        text/xml
        text/tg.edition+tg.aggregation+xml
        text/tg.work+xml
    edition.agent.value:
        Eichendorff, Joseph von
        Bechstein, Ludwig
        Kerner, Justinus
        Schöppner, Alexander
        Arnim, Ludwig Achim von
        Geibel, Emanuel
        Grässe, Johann Georg Theodor
        Gutzkow, Karl
        Heine, Heinrich
        Lingg, Hermann von

Based on these facets it is possible to apply filters, e.g. all files with format "text/xml" where the agent is "Eichendorff, Joseph von".

Example request (filter for XML from agent Eichendorff)::

    curl -s "https://textgridlab.org/1.0/tgsearch-public/search?q=waldeinsamkeit&filter=format:text/xml&filter=edition.agent.value:Eichendorff,%20Joseph%20von"

Example request (all image/jpeg files from the project "Digitale Bibliothek")::

    curl -s "https://textgridlab.org/1.0/tgsearch-public/search/?filter=format:image/jpeg&filter=project.id:TGPR-372fe6dc-57f2-6cd4-01b5-2c4bbefcfd3c"
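
Because ``facet`` and ``filter`` can each appear several times, a query string with repeated keys has to be built from a list of pairs rather than a dictionary. The Python sketch below shows one way to do this; the helper name ``build_faceted_url`` is hypothetical, and the ``field:value`` filter form is taken from the curl examples above, not from a documented filter syntax.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the example requests above.
BASE_URL = "https://textgridlab.org/1.0/tgsearch-public/search"


def build_faceted_url(query=None, facets=(), filters=()):
    """Build a search URL with repeated `facet` and `filter` parameters.

    `facets` is a sequence of metadata field names.
    `filters` is a sequence of (field, value) pairs.
    """
    pairs = []
    if query is not None:
        pairs.append(("q", query))
    pairs.extend(("facet", field) for field in facets)
    # Each filter is sent as "field:value", mirroring the curl examples.
    pairs.extend(("filter", f"{field}:{value}") for field, value in filters)
    # Keep ":", "/" and "," unescaped and encode spaces as %20,
    # so the result looks like the hand-written URLs.
    return BASE_URL + "?" + urlencode(pairs, safe=":/,", quote_via=quote)


print(build_faceted_url(
    "waldeinsamkeit",
    filters=[("format", "text/xml"),
             ("edition.agent.value", "Eichendorff, Joseph von")],
))
```

Passing filters as a list of pairs instead of a dictionary also allows the same field to be filtered more than once, should that be needed.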
Paging
------
Paging with the parameters ``start`` (and ``limit``) only works up to 10,000 hits if you use a number as the ``start`` value. In addition, the stability
of the ordering is not guaranteed between two requests. This is due to Elasticsearch's ``index.max_result_window`` setting (see elasticsearch_setting_),
which defaults to 10,000, and we are advised not to change it for performance reasons.

Since August 2025, tgsearch therefore supports a ``start`` parameter that consists of the sort order criteria and the index id (see
elasticsearch_search_after_). To make this parameter easy to use, tgsearch responds with a ``next`` attribute, whose value can be passed as ``start``
to retrieve the next result set.

Example: retrieve the next 20 results (a limit of 20 is the default) for "waldeinsamkeit", continuing the first query example.
The opening XML element of that response carries the ``next`` attribute, so we use its value as the ``start`` parameter::

    curl -s "https://textgridlab.org/1.0/tgsearch-public/search?q=waldeinsamkeit&start=10.878001:vf9m.0"

The ``next`` attribute looks different depending on the ``order`` parameter. In the case above the default sorting is by relevance, so the value is a
combination of score and id. The ``next`` attribute will contain a title if you sort by title.

The ``next`` attribute will be ``null`` if there is nothing more to get. For
performance reasons (see track_total_hits_), the total number of hits is only tracked on the
first search, i.e. when no ``start`` parameter is given, or, for backwards compatibility, when ``start`` is an integer.

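
The paging scheme described above can be sketched as a loop that keeps passing the ``next`` value back as ``start``. The Python example below is a minimal sketch, not the reference client: the function names are hypothetical, the ``next`` attribute is assumed to sit on the root element of the response, and both a missing attribute and the literal string ``null`` are treated as the end of the results. The HTTP call is left to a caller-supplied ``fetch_page`` function, and the ``<response>`` element in the demo is made up.

```python
import xml.etree.ElementTree as ET


def next_start(response_xml):
    """Return the `next` value of the root element, or None at the end."""
    root = ET.fromstring(response_xml)
    value = root.get("next")
    # Assumption: a missing attribute or the string "null" means no more pages.
    if value is None or value == "null":
        return None
    return value


def iter_pages(fetch_page, query):
    """Yield response bodies page by page, following `next` until it runs out.

    `fetch_page(params)` must perform the request for the given parameter
    dict and return the response body as a string.
    """
    params = {"q": query}
    while True:
        body = fetch_page(params)
        yield body
        start = next_start(body)
        if start is None:
            break
        params = {"q": query, "start": start}


# Demo with canned responses instead of a real server (element name made up).
fake = {
    None: '<response next="10.878001:vf9m.0"/>',
    "10.878001:vf9m.0": '<response next="null"/>',
}
pages = list(iter_pages(lambda p: fake[p.get("start")], "waldeinsamkeit"))
print(len(pages))
```

In real use, ``fetch_page`` could wrap ``urllib.request.urlopen`` around a URL built from the parameter dict; keeping it separate makes the loop easy to try out without network access.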
.. _lucene_syntax: https://textgridrep.org/en/syntax
.. _elasticsearch_setting: https://www.elastic.co/docs/reference/elasticsearch/index-settings/index-modules
.. _elasticsearch_search_after: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/search-request-search-after.html
.. _track_total_hits: https://www.elastic.co/docs/solutions/search/the-search-api#track-total-hits