Reference Manual
Use this page as a cheatsheet. If you are a new user we recommend you start with the tutorial (coming soon).
Implementation info
Simple types
ObjectPath inherits all JSON data types and supports a few more.
- number, such as 100 or 2.333
- string, such as "string" or 'string'
- null, such as null
- true, such as true or t
- false, such as false or f
ObjectPath type rules:
- number is split into integer and float:
- integer is a fast and small integer number
- float is a slow and very accurate real number
- Strings are encoded in UTF-8,
- Added the datetime, date and time types to handle dates and times,
- Boolean types and null are case insensitive and can be written in several ways: true can be written as t or true (or trUe), false as f or false, null as n, none, null or nil.
- Negative values are: false, null, 0, "", [], {}. Any other value is positive.
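The negative/positive rule above can be sketched in plain Python. This is an illustration only, not ObjectPath's implementation; `NEGATIVE` and `is_negative` are hypothetical names introduced for the example.

```python
# Hypothetical helper, not part of ObjectPath: mirrors the list of
# negative values given above (false, null, 0, "", [], {}).
NEGATIVE = [False, None, 0, "", [], {}]


def is_negative(value):
    """Return True if value equals one of the documented negative values."""
    return any(value == negative for negative in NEGATIVE)


for sample in [False, None, 0, "", [], {}, "0", [0], 1]:
    print(repr(sample), "negative" if is_negative(sample) else "positive")
```

Note that "0" and [0] count as positive here, because only the empty string and the empty array appear in the list.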
Complex types
Arrays can be defined in the language by issuing:
[ val1, val2, ..., valN]
and for objects it is:
{ "atr1":val1, "atr2":val2, ..., "atrN":valN }
{ atr1:val1, atr2:val2, ..., atrN:valN }
- Attribute names can be dynamically created. Example:
{$.myCar.model: "avg temperature is " + avg($.cars[@.model is $.myCar.model].temp)} -> {"Toyota Prius": "avg temperature is 4.5"}
- When an expression returns a value other than a string, it is cast to string. Example:
{2+2:"4"} -> {"4":"4"}
- Names can be written without quotes, but in that case they cannot start with an operator. ObjectPath raises an exception if a name is not valid.
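As a rough Python analogy for the key-casting rule above, the sketch below builds an object from computed keys and casts every non-string key to a string. `build_object` is a hypothetical helper, not an ObjectPath function.

```python
def build_object(pairs):
    """Build a dict from (key, value) pairs, casting non-string keys to str."""
    return {
        key if isinstance(key, str) else str(key): value
        for key, value in pairs
    }


# The key is computed first (2 + 2 gives 4) and then cast to the string "4".
print(build_object([(2 + 2, "4")]))
```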
Dates and times
Dates are handled as datetime objects. There are also date and time objects which provide support for date and time manipulation. Datetime, date and time objects support year, month, day, hour, minute, second and microsecond attributes, e.g. now().year -> 2011.
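Assuming these objects behave like Python's own datetime.datetime (an assumption; the text above does not say how they are implemented), the attributes can be read as in this short sketch. The timestamp is the one used in the examples elsewhere on this page.

```python
from datetime import datetime

# A fixed moment instead of now(), so the output is reproducible.
moment = datetime(2011, 4, 8, 13, 3, 55, 747070)

parts = [
    moment.year, moment.month, moment.day,
    moment.hour, moment.minute, moment.second, moment.microsecond,
]
print(parts)  # [2011, 4, 8, 13, 3, 55, 747070]
```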
Operators
How operators work
Operators take computation results from their left and right side and perform an operation on these results. While the behaviour of simple operators is straightforward, operators such as ., .. and [] are more advanced.
Each of them works on an array of objects on the left and applies the instruction found on the right to each element of this array. To better understand what this means, consider a document:
{ "a":1, "b":{ "c":[1,2,3] } }
and path:
$.b.c[@ > 2]
It works internally like this:
>>> $
{ "a":1, "b":{ "c":[1,2,3] } }
>>> $.b
{ "c":[1,2,3] }
>>> $.b.c
[1,2,3]
>>> $.b.c[@ > 2]
# left | right        | result
# [    |              | [
# 1    | > 2 -> false | #nothing
# 2    | > 2 -> false | #nothing
# 3    | > 2 -> true  | 3
# ]    |              | ]
[3]
The result is always an array, because ObjectPath executes > for each element in the array and filters out elements that do not match the condition.
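The same walk can be written in plain Python. This is only a sketch of the evaluation order described above, not ObjectPath's interpreter; the variable names are illustrative.

```python
document = {"a": 1, "b": {"c": [1, 2, 3]}}

# $.b.c: follow the attributes from the root.
left = document["b"]["c"]

# [@ > 2]: test every element of the left-hand array and keep the matches.
result = [element for element in left if element > 2]

print(result)  # [3]
```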
Operator precedence
Operations are executed in the following order:
(. = [ = () -> (+ (prefix) = - (prefix)) -> (* = /) -> (+ (infix) = - (infix)) -> (in = not = is = < = > = <= = >=) -> not -> and -> or
Other operators and tokens are precedence-neutral.
Arithmetic operators
Operator | Description | Example | Notes | Support |
---|---|---|---|---|
+ | addition | 2 + 3 -> 5 | + is also a concatenation operator | |
- | subtraction | 2 - 3 -> -1 | | |
* | multiplication | 2 * 3 -> 6 | Alternative use is select all objects from array | |
/ | division | 2/3 -> 0.6666666666666666 | Integer division results in floating point number. Use int() built-in function to turn it into integer again. | |
% | modulo | 10%3 -> 1 | | |
Boolean logic operators
Operator | Description | Example | Notes | Support |
---|---|---|---|---|
not | negation | not 3 -> false | not always casts result to boolean type | |
and | conjunction | true and false -> false | and evaluates the expression on the left-hand side; if it is negative, it is returned and the expression on the right-hand side is not evaluated. | |
or | alternation | true or false -> true | or evaluates the expression on the left-hand side; if it is positive, it is returned and the expression on the right-hand side is not evaluated. | |
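The short-circuit behaviour described in the Notes column can be sketched in Python as follows. `op_and`, `op_or` and `right_side` are hypothetical names, and Python truthiness stands in for ObjectPath's positive/negative values. The right-hand side is passed as a function so the example can show when it is skipped.

```python
def op_and(left, right_thunk):
    """Return left if it is negative, otherwise evaluate and return the right side."""
    return left if not left else right_thunk()


def op_or(left, right_thunk):
    """Return left if it is positive, otherwise evaluate and return the right side."""
    return left if left else right_thunk()


calls = []


def right_side():
    # Records every evaluation so skipped evaluations are visible.
    calls.append("evaluated")
    return "right"


and_result = op_and(0, right_side)     # 0 is negative, returned as-is
or_result = op_or("left", right_side)  # "left" is positive, returned as-is
print(and_result, or_result, calls)    # 0 left []
```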
Comparison operators
ObjectPath uses algorithms that guess types of data. In most cases you don't need to worry about issues with types (e.g. comparing string "2" with number 2 works as if 2 was a string). To check if types are the same or different you can use type("2") is type(2) -> false.
In semi-structured data a very common issue is that the data are represented as a different type than they were supposed to be. For example, very often we can see numbers represented as strings (like ["1", "2", "3"] instead of simply [1,2,3]). This is caused by automatic JSON generators, especially those that convert XML data into JSON. In programming languages such as Javascript this can cause many headaches and hours wasted figuring out why the code doesn't work. That's why:
1 < 2 < 2 will compare 2 with the result of 1 < 2, which is true, and return true. Use 1 < 2 and 2 < 3 instead.
Operator | Description | Example | Notes | Support |
---|---|---|---|---|
is | equality | '3' is 3 -> true | | |
is not | equality negation | 3 is not 3 -> false | | |
>, >=, <, <= | greater than, greater than or equal, less than, less than or equal | 1 > 0 -> true | | |
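Both points can be sketched in Python. The parentheses reproduce the left-to-right evaluation described above (Python itself treats a bare 1 < 2 < 2 as a chained comparison, so they are needed here). `loosely_equal` is a hypothetical helper that follows the "works as if 2 was a string" description; how ObjectPath actually guesses types is not shown on this page.

```python
# Left-to-right evaluation: 1 < 2 gives True, and True < 2 is also True.
chained = (1 < 2) < 2

# Writing the two conditions separately gives the result most readers expect.
explicit = (1 < 2) and (2 < 2)

print(chained, explicit)  # True False


def loosely_equal(left, right):
    """If exactly one side is a string, compare both sides as strings."""
    if isinstance(left, str) != isinstance(right, str):
        return str(left) == str(right)
    return left == right


print(loosely_equal("3", 3))  # True
```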
Membership tests
Operator | Description | Example | Notes | Support |
---|---|---|---|---|
in | Checks if the result of the left side of expression is in array, object or string | 3 in [1,2,4] -> false, "ia" in "Adrian" -> true | In objects, keys are matched. | |
not in | Opposite behavior to in; equivalent to not expr in array | 1 not in [1,2,3] -> false | | |
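Python's own in operator follows the same three cases (array, object keys, substring), so the table rows can be checked with a short sketch. The variable names are illustrative only.

```python
in_array = 3 in [1, 2, 4]          # element lookup in an array
in_string = "ia" in "Adrian"       # substring lookup
in_object = "a" in {"a": 1}        # objects are matched by key
not_in_array = 1 not in [1, 2, 3]  # same as: not (1 in [1, 2, 3])

print(in_array, in_string, in_object, not_in_array)  # False True True False
```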
Concatenation operator +
Besides standard addition of numbers, + concatenates strings, arrays and objects.
If two arrays are concatenated, right array elements are added to the end of the left array.
If a string is concatenated with an array, it is added to the beginning or end of the array depending on the order.
Objects are merged so that the right object overwrites existing elements of the left object. Object concatenation is not deep: only direct child elements of the root element are overwritten rather than leaf nodes.
>>> [1, 2, 4] + [3, 5]
[1, 2, 4, 3, 5]
>>> "aaa"+["bbb"]
["aaa", "bbb"]
>>> ["bbb"]+"aaa"
["bbb", "aaa"]
>>> {"a":1, "b":2} + {"a":2, "c":3}
{"a":2, "b":2, "c":3}
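The rules above can be collected into one small Python function. This is a sketch of the described behaviour under the stated rules, not ObjectPath's code; `concat` is a hypothetical name.

```python
def concat(left, right):
    """Sketch of the + rules for arrays, strings with arrays, and objects."""
    if isinstance(left, list) and isinstance(right, list):
        return left + right              # right elements go to the end
    if isinstance(left, list) and isinstance(right, str):
        return left + [right]            # string appended to the array
    if isinstance(left, str) and isinstance(right, list):
        return [left] + right            # string prepended to the array
    if isinstance(left, dict) and isinstance(right, dict):
        merged = dict(left)              # shallow copy of the left object
        merged.update(right)             # right overwrites top-level keys only
        return merged
    return left + right                  # numbers and plain strings


print(concat([1, 2, 4], [3, 5]))
print(concat("aaa", ["bbb"]))
print(concat({"a": 1, "b": 2}, {"a": 2, "c": 3}))
# The merge is not deep: the nested object is replaced as a whole.
print(concat({"a": {"x": 1, "y": 2}}, {"a": {"x": 9}}))
```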
Built-in functions
Casting functions
Casting is done by passing arguments to Python functions of the same name.
Function | Example | Notes | Support |
---|---|---|---|
str(ANY) | str(11) -> '11' | | |
int(NUMBER/STRING) | int(123.45) -> 123 | | |
float(NUMBER/STRING) | float('123.45') -> 123.45 | | |
array(ANY) | array(now()) -> [2011, 4, 8, 13, 3, 55, 747070] | | |
Arithmetic functions
Function | Example | Notes | Support |
---|---|---|---|
sum(ARRAY) | sum([1, 2, 3, 4]) -> '10' | Argument is a list of numbers. If there are float numbers in the list, sum() returns float. | |
max(ARRAY) | max([2, 4, 1, 3]) -> 4 | | |
min(ARRAY) | min([2, 4, 1, 3]) -> 1 | | |
avg(ARRAY) | avg([2, 4, 1, 3]) -> 2.5 | Equivalent to sum(array)/len(array) | |
round(FLOAT, INTEGER) | round(0.55, 1) -> 0.5 | Always returns float. Second argument defines the precision of round. | |
String functions
Function | Example | Notes | Support |
---|---|---|---|
replace(STRING, toReplace, replacement) | replace('abcd','b','a') -> 'aacd' | | |
escape(STRING) | escape('<>&\'"') -> "&lt;&gt;&amp;&apos;&quot;" | Converts HTML reserved characters to their HTML equivalents. | |
unescape(STRING) | unescape("&lt;&gt;&amp;&apos;&quot;") -> '<>&\'"' | Reverse of escape. | |
upper(STRING) | upper('AaA') -> 'AAA' | | |
lower(STRING) | lower('AaA') -> 'aaa' | | |
capitalize(STRING) | capitalize('AaA') -> 'Aaa' | | |
title(STRING) | title('aaa bbb') -> 'Aaa Bbb' | | |
split(STRING <, sep>) | split('aaa bbb') -> ['aaa', 'bbb'], split('aaa,bbb',',') -> ['aaa', 'bbb'] | | |
slice(STRING, [start, end]), slice(STRING, [[start, end], [start,end], ...]) | slice('Hello world!', [0, 5]) -> 'Hello', slice('Hello world!', [[0, 5], [6, 11]]) -> ["Hello", "world"] | added in ObjectPath 0.5 | |
The slice() function (added in ObjectPath 0.5)
Extracts a section of a string and returns a new string. The usage is:
slice(string, [start, end])
slice(string, [[start, end], [start2, end2], ...])
The first argument is always a string. The second argument can be either an array of two numbers (start and end positions) or an array of arrays of two numbers (start and end position). If position is negative, then position is counted starting from the end of the string. Examples:
slice("Hello world!", [6, 11]) -> "world"
slice("Hello world!", [6, -1]) -> "world"
slice("Hello world!", [[0, 5], [6, 11]]) -> ["Hello", "world"]
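The three examples above map directly onto Python string slicing, including the negative end position. The sketch below is an illustration under that assumption, not the library's source; `slice_string` is a hypothetical name.

```python
def slice_string(text, ranges):
    """Return one substring for [start, end], or a list for [[start, end], ...]."""
    if ranges and isinstance(ranges[0], (list, tuple)):
        # Array of ranges: one substring per [start, end] pair.
        return [text[start:end] for start, end in ranges]
    start, end = ranges
    return text[start:end]


print(slice_string("Hello world!", [6, 11]))            # world
print(slice_string("Hello world!", [6, -1]))            # world
print(slice_string("Hello world!", [[0, 5], [6, 11]]))  # ['Hello', 'world']
```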
Array functions
Function | Example | Notes | Support |
---|---|---|---|
sort(ARRAY <, key>) | sort(['c', 'b', 'a']) -> ['a', 'b', 'c'], sort([{v:'c', x:1},{v:'b', x:2},{v:'a', x:3}], 'v') -> [{v:'a', x:3}, {v:'b', x:2}, {v:'c', x:1}] | If key is provided, will sort array of objects by key. | |
reverse(array) | reverse([1,3,2,2,5]) -> [5,2,2,3,1] | Reverse may be very slow for large data. | |
count(ARRAY), len(ARRAY) | count([1, 2, 3]) -> 3 | | |
join(ARRAY <, joiner>) | join(['c', 'b', 'a']) -> 'cba', join(['c', 'b', 'a'], '.') -> 'c.b.a' | | |
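For readers who know Python, the rows above correspond closely to built-in operations. The sketch below reproduces the table's examples with plain Python; it is an analogy and says nothing about how ObjectPath implements these functions.

```python
items = [{"v": "c", "x": 1}, {"v": "b", "x": 2}, {"v": "a", "x": 3}]

# sort(ARRAY, 'v'): order the objects by the value stored under key "v".
sorted_by_v = sorted(items, key=lambda item: item["v"])

# reverse(ARRAY): a reversed copy of the array.
reversed_copy = list(reversed([1, 3, 2, 2, 5]))

# join(ARRAY) and join(ARRAY, '.'): concatenate with an optional joiner.
joined_plain = "".join(["c", "b", "a"])
joined_dots = ".".join(["c", "b", "a"])

print(sorted_by_v)
print(reversed_copy, joined_plain, joined_dots)
```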
Date and time functions
All date and time functions manipulate datetime objects. The default (and only) timezone is UTC.
Function | Example | Notes | Support |
---|---|---|---|
now() | now() -> '2011-04-08 13:03:55.747070' | Gets current UTC time. | |
date(arg) | date() -> '2011-04-08', date([2011, 4, 8]) -> '2011-04-08', date(now()) -> '2011-04-08' | arg can be an array of structure [yyyy,mm,dd] or a datetime object. If no arg is specified then date() defaults to the current UTC date. | |
time(arg) | time() -> '13:03:55.747070', time([13, 3, 55, 747070]) -> '13:03:55.747070', time(now()) -> '13:03:55.747070' | arg can be an array of structure [hh,mm,ss,mmmmmm] where only hour is required, or a datetime object. If no arg is specified then time() defaults to the current UTC time. | |
dateTime(args) | dateTime(now()) -> '2011-04-08 13:03:55.747070', dateTime([2011, 4, 8, 13, 3, 55, 747070]) -> '2011-04-08 13:03:55.747070', dateTime(date(), time()) -> '2011-04-08 13:03:55.747070', dateTime([2011, 4, 8], time()) -> '2011-04-08 13:03:55.747070' | args: if one argument is specified then it needs to be a datetime object or [yyyy, mm, dd, hh, mm, ss, mmmmmm] where year, month, day, hour and minute are required. If two arguments are specified, the first argument can be a date object or [yyyy, mm, dd] array, the second can be a time object or [hh, mm, ss, mmmmmm] array where only hour is required. | |
age(time) | age(sometime) -> [1, 'week'] | Counts how old the provided time is and prettyprints it. | |
toMillis(time) | | Counts milliseconds since epoch. | |
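In Python terms, the two-argument form of dateTime() resembles datetime.combine, and toMillis() resembles counting milliseconds from the Unix epoch. The sketch below assumes naive values are UTC, as stated above; `to_millis` and `EPOCH` are hypothetical names, not ObjectPath identifiers.

```python
from datetime import date, datetime, time, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(moment):
    """Whole milliseconds since the Unix epoch; naive datetimes are treated as UTC."""
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


# Like dateTime([2011, 4, 8], [13, 3, 55, 747070]): a date part plus a time part.
combined = datetime.combine(date(2011, 4, 8), time(13, 3, 55, 747070))

print(combined)  # 2011-04-08 13:03:55.747070
print(to_millis(datetime(1970, 1, 1, 0, 0, 1)))  # 1000
```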
Misc functions
Function | Example | Notes | Support |
---|---|---|---|
type(ANY) | type([1,3,2,5]) -> 'array' | Tool helpful in debugging expressions. | |
count(any), len(any) | count("abcd") -> 4 | Counts elements in a given argument. If element is not countable, it is returned unmodified. | |
generateID() | | Generates unique ID | |
Localization function
The localize() function tries to localize an argument. For now it works only for dates and times, but it is meant to support numbers and currencies.
It is a good idea to store UTC times in the database and show localized time to the user. To show the localized current time in Warsaw, Poland, use the localize() function as follows:
localize(now(),'Europe/Warsaw')
Paths
Paths use dot notation:
$.attributeName[selector].attributeName2
where $ is the root element, . selects all direct child elements from a node, attributeName restricts these elements to those containing an attribute named attributeName, and [] contains a selector expression which restricts the results even more. attributeName2 selects child elements from the results of the previous computation.
Complete syntax
The following table contains the complete syntax of paths (nearly identical to the JSONPath table):
Operator | Description |
---|---|
$ | the root object/element |
@ | the current object/element |
. | child/property operator |
.. | recursive descent. ObjectPath borrows this syntax from E4X (and JSONPath). |
* | wildcard. All objects/elements regardless of their names. |
[] | selector operator. Full documentation is available in the next chapter. |
Selectors
A selector selects array elements that satisfy an expression. You can be creative here and nest paths, use comparison operators or refer to other parts of the data set.
Selectors are written inside the [] operator.
A selector can take a number and behave as a simple "give me the n-th element from the array":
>>> $.*[1] # second element from array
It can also take a string; its behavior is then similar to the dot operator:
>>> $..*['string'] is $..string
true
Selector can use comparison operators to find elements:
$..*[@.attribute is 'ok']
and boolean operators to handle more than one condition:
$..*[@.attribute is 'ok' or len(@.*[1]) is 2]
The @ operator matches the current element. A selector iterates over the result of the left expression (which is an array), and @ matches the element currently being checked against the expression.
WARNING! The @ operator is slow! Try using or/and operators in conjunction with a simple expression that should match some elements, or use other optimization techniques.
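To make the iteration concrete, here is a minimal Python sketch of how an expression like $..*[@.attribute is 'ok'] could be evaluated: walk every nested value, then keep the objects that pass the test. `descend` and the sample `document` are hypothetical, and the exact set of elements that ObjectPath's .. operator visits is an assumption here.

```python
def descend(node):
    """Yield every value nested anywhere below node (a sketch of $..*)."""
    if isinstance(node, dict):
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        children = []
    for child in children:
        yield child
        yield from descend(child)


document = {
    "items": [{"attribute": "ok", "id": 1}, {"attribute": "bad", "id": 2}],
    "extra": {"attribute": "ok", "id": 3},
}

# The selector: the loop variable plays the role of @, the current element.
matches = [
    current for current in descend(document)
    if isinstance(current, dict) and current.get("attribute") == "ok"
]
ids = [match["id"] for match in matches]
print(ids)  # [1, 3]
```

Every element is visited and tested once, which gives a feel for why a selector that uses @ on a large recursive result can be expensive.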
Plans to extend ObjectPath
Most important enhancements:
- provide date and time manipulation built-in functions,
- distinguish regex functions and string functions - regexes are slow and not necessary in most cases,
- add regex matching to selectors.
Optimization plans
- make $..*[1] faster - generator will help,
- replace operator names with numbers.