Index Search Overview
General
Text Processing
All searchable words must consist of alphanumeric characters. Any other character is treated as a word separator. The following list gives examples and is not exhaustive:
- Non-printable characters e.g. blanks, tabs, new lines
- Printable characters for punctuation e.g. .?!,:;-_[]{}()`´'"
- Currency symbols e.g. $£€
- Other printable special characters e.g. +*<>&/\^~
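The separator rule above can be sketched in Python. This is a minimal illustration, not the index's actual tokenizer: the name split_words is hypothetical, and the pattern assumes that "alphanumeric" means Unicode letters and digits, with the underscore treated as a separator as listed above.

```python
import re

# Assumption: a "word" is a maximal run of letters and digits.
# [^\W_] is \w minus the underscore, which the list above names as a separator.
WORD = re.compile(r"[^\W_]+")

def split_words(text):
    """Split text into searchable words; every other character acts as a separator."""
    return WORD.findall(text)

print(split_words("energy-plan_2024: cost > $5 (draft)"))
# ['energy', 'plan', '2024', 'cost', '5', 'draft']
```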
Case Sensitivity
Searches are case-insensitive: a search for energy finds all case variations, e.g. Energy, energy, ENERGY, EnErGy.
Accented Characters
English uses diacritics sparingly compared to many other languages, particularly those derived from Latin. For compatibility, all diacritics are removed so that every accented character maps to its base character. A search for énergie is therefore the same as a search for energie, and either one finds both spellings.
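Case folding and diacritic removal can be illustrated together. The sketch below is one common way to do this in Python (Unicode decomposition followed by dropping combining marks). It is an assumption about the general technique, not a description of how the index is implemented, and normalize_term is a hypothetical name.

```python
import unicodedata

def normalize_term(term):
    """Lower-case a term and strip diacritics (hypothetical helper)."""
    # NFD splits an accented character into its base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", term)
    # Drop the combining marks and keep the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

print(normalize_term("Énergie"))  # energie
print(normalize_term("EnErGy"))   # energy
```

Applying the same normalization to indexed words and to search terms is what makes énergie and energie interchangeable.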
Term handling
| Single term | Enter a single word, e.g. energy, to show Documents containing that word. |
| Phrase term | Enter multiple words, e.g. energy plan, to show Documents containing that exact phrase. For better readability it is recommended to enclose phrases in straight quotes, e.g. "energy plan". |
| Multiple terms | Terms and phrases can be linked with logical operators, which are explained in the next section. |
Operators
AND
All connected terms are required to be present. For example, energy AND plan AND resources matches only Documents that contain all three terms.
OR
At least one of the connected terms must be present. For example, energy OR plan OR resources matches Documents that contain any one or more of the three terms.
NOT
Negates the immediately following term or phrase.
- NOT plan will yield Documents that do not contain the word plan.
- energy AND NOT plan will yield Documents that contain the term energy but not the term plan.
- energy AND NOT "energy plan" will yield Documents that contain the term energy, excluding any Document that contains the phrase "energy plan".
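The behaviour of AND, OR and NOT corresponds to set intersection, union and complement over the Documents that contain each term. The sketch below uses a made-up inverted index (the document ids and the docs helper are purely illustrative) to show the results the examples above would produce.

```python
# Toy inverted index: term -> ids of Documents containing it (made-up data).
index = {
    "energy": {1, 2, 3},
    "plan": {2, 3, 4},
    "resources": {3, 5},
}
all_docs = {1, 2, 3, 4, 5}

def docs(term):
    """Return the set of Document ids containing the term (hypothetical helper)."""
    return index.get(term, set())

and_result = docs("energy") & docs("plan") & docs("resources")  # energy AND plan AND resources
or_result = docs("energy") | docs("plan") | docs("resources")   # energy OR plan OR resources
not_result = all_docs - docs("plan")                             # NOT plan
and_not_result = docs("energy") - docs("plan")                   # energy AND NOT plan

print(and_result, or_result, not_result, and_not_result)
```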
W/n
Finds a term within distance n of another term, in either order. For example, energy W/5 plan* is the same search as plan* W/5 energy and will yield Documents containing the following sentences:
- … energy plan …
- … planned energy …
- … energy was distributed according to plan …
- … plants deliver insufficient energy …
- … energy distribution requires extensive planning …
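A proximity check of this kind can be sketched as follows. The sketch assumes that distance means the difference between word positions, which is consistent with the sentences above but is not stated by the source. The function and predicate names are hypothetical.

```python
import re

def tokens(text):
    """Lower-cased words; non-alphanumeric characters act as separators."""
    return [t.lower() for t in re.findall(r"[^\W_]+", text)]

def within(text, first, second, n):
    """True if a word matching `first` and a word matching `second`
    are at most n positions apart, in either order."""
    toks = tokens(text)
    pos_a = [i for i, t in enumerate(toks) if first(t)]
    pos_b = [j for j, t in enumerate(toks) if second(t)]
    return any(i != j and abs(i - j) <= n for i in pos_a for j in pos_b)

def is_energy(tok):
    return tok == "energy"

def is_plan(tok):
    return tok.startswith("plan")  # stands in for plan*

sentences = [
    "energy plan",
    "planned energy",
    "energy was distributed according to plan",
    "plants deliver insufficient energy",
    "energy distribution requires extensive planning",
]
for s in sentences:
    print(s, "->", within(s, is_energy, is_plan, 5))
```

Because the check uses the absolute difference of positions, swapping the two predicates gives the same result, matching the "no specific order" rule.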
Wildcards
?
Use the question mark to substitute exactly one arbitrary character. For example, plan? will hit on plans, plane, plant but not on plan or planned.
*
Use the asterisk to match zero or more arbitrary characters. For example, plan* will hit on plan, plans, plant, plane, plants, planes, planned.
=
Use the equal sign to match any single digit. For example, 20== will hit on 2000, 2001, 2002, […], 2098, 2099.
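One way to reason about the three wildcards is to translate them into an ordinary regular expression. The following Python sketch does that. The mapping and the function names are assumptions made for illustration, and the sketch assumes each wildcard stands for letters or digits only, since all other characters are word separators.

```python
import re

LETTER_OR_DIGIT = r"[^\W_]"  # \w without the underscore

# Assumed translation of each wildcard into a regular expression fragment.
WILDCARDS = {
    "?": LETTER_OR_DIGIT,        # exactly one character
    "*": LETTER_OR_DIGIT + "*",  # zero or more characters
    "=": r"\d",                  # exactly one digit
}

def wildcard_to_regex(term):
    """Compile a wildcard term into a case-insensitive regular expression."""
    parts = [WILDCARDS.get(ch, re.escape(ch)) for ch in term]
    return re.compile("".join(parts), re.IGNORECASE)

def matches(term, word):
    """True if the whole word matches the wildcard term."""
    return wildcard_to_regex(term).fullmatch(word) is not None

print(matches("plan?", "plant"), matches("plan?", "plan"))    # True False
print(matches("plan*", "planned"), matches("20==", "2099"))   # True True
```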
Fuzziness
Fuzzy search matches terms that are within an edit distance of n from the search term. The distance n must be either 1 or 2. A distance of 1 means the match can differ by a single character insertion, deletion, substitution, or transposition. For example, the query plan%1 might return results like pan, plain, plank, or plant.
The effectiveness of fuzzy search increases with longer words. For instance, searching for energy%1 may yield few results beyond exact matches or simple typos. However, increasing the distance with energy%2 may retrieve matches like synergy, emerge, entry, or every. While fuzziness can help catch typos or near matches, it may also significantly reduce precision by including unrelated terms.
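The distance described above (insertion, deletion, substitution, transposition) can be computed with the optimal string alignment variant of the Damerau-Levenshtein distance. The sketch below is a textbook implementation, offered only to make the examples checkable. The source does not say which algorithm the search engine uses, and edit_distance is a hypothetical name.

```python
def edit_distance(a, b):
    """Optimal string alignment distance: insertions, deletions,
    substitutions and transpositions of adjacent characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

for word in ["pan", "plain", "plank", "plant"]:
    print("plan", word, edit_distance("plan", word))      # each distance is 1
for word in ["synergy", "emerge", "entry", "every"]:
    print("energy", word, edit_distance("energy", word))  # each distance is 2
```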
Grouping
( )
Use parentheses to clarify the processing order in complex queries, e.g. energy AND (plan OR resources).
Regular Expressions
General
A regular expression term starts with a straight quote followed by two pound signs, continues with the pattern, and ends with a closing straight quote: "##...".
Given the complexity of regular expressions, only a selected subset of commonly used patterns is shown below.
Character Classes
Shorthand
| . | any character |
| \d | digit |
| \D | not digit |
Unicode
| \pX | Unicode character class identified by abbreviation |
| \p{Greek} | Unicode character class identified by name |
| \PX | negated Unicode character class identified by abbreviation |
| \P{Greek} | negated Unicode character class identified by name |
Bracket
| [0-9] | digit character class |
| [a-z] | matching any character in range a-z |
| [abc] | matching either a, b or c |
| [^abc] | matching any character except a, b and c |
| [[:digit:]] | digit character class |
| [[:alpha:]] | alphabet character class |
| [[:^alpha:]] | negated alphabet character class |
Repetitions
| a? | zero or one of a |
| a* | zero or more of a |
| a+ | one or more of a |
| a{n} | exactly n a |
| a{n,m} | at least n a and at most m a |
| a{n,} | at least n a |
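To experiment with the shorthand, bracket and repetition patterns outside the search interface, a small Python sketch can help. It rests on several assumptions: the helper names are hypothetical, the pattern is assumed to be applied to a whole word, and Python's standard re module is used only as a stand-in. That module does not support the \p{...} and [[:alpha:]] forms listed above, so the sketch covers only the remaining subset.

```python
import re

def pattern_from_term(term):
    """Return the pattern inside a "##..." term, or None if the term
    is not a regular expression term."""
    if len(term) >= 4 and term.startswith('"##') and term.endswith('"'):
        return term[3:-1]
    return None

def word_matches(term, word):
    """True if the whole word matches the pattern of a regular expression term.
    Assumption: the pattern must match the entire word."""
    pattern = pattern_from_term(term)
    if pattern is None:
        raise ValueError("not a regular expression term")
    return re.fullmatch(pattern, word) is not None

print(word_matches(r'"##\d{4}"', "2024"))        # four digits
print(word_matches(r'"##[a-z]+\d?"', "plan7"))   # letters, then an optional digit
print(word_matches(r'"##[^abc]{2,}"', "ab"))     # two or more characters other than a, b, c
```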