Any ideas for a fast way to search a table with JSON stored in a memo field?
Below is a sample solution, but it is slower than I would like. On my laptop, searching 1,000 records takes just over 1.5 seconds.
The only other solution I've come up with is to add a bunch of fields to the table and do a normal search, but that has its drawbacks: besides being very sloppy, it loses the flexibility of storing the values in JSON.
DO wconnect
* DEBUG
* SUSPEND
lnStart = SECONDS()
o=CREATEOBJECT('myTest')
o.myQuery()
? SECONDS()-lnStart
***********************
***********************
DEFINE CLASS myTest as Custom
	oSerializer = .null.
	***********************
	***********************
	FUNCTION Init()
		this.oSerializer = CREATEOBJECT('wwJsonSerializer')
	ENDFUNC
	***********************
	***********************
	FUNCTION myQuery()
		SELECT * FROM 'c:\dataPath\Notes' WHERE this.UDF( oJson ) == .t. INTO CURSOR tQuery
	ENDFUNC
	***********************
	***********************
	FUNCTION UDF(lcJson)
		IF EMPTY(lcJson)
			RETURN .f.
		ENDIF
		TRY
			loJson = this.oSerializer.DeserializeJson(lcJson)
			llReturn = .f.
		CATCH
			llReturn = .t.
		ENDTRY
		IF llReturn
			RETURN llReturn
		ENDIF
		IF PEMSTATUS(loJson, 'VdrFk', 5)
			IF loJson.VdrFk = 150
				RETURN .t.
			ENDIF
		ENDIF
		RETURN .f.
	ENDFUNC
	***********************
	***********************
	FUNCTION Destroy()
		this.oSerializer = .null.
	ENDFUNC
	***********************
	***********************
ENDDEFINE
It depends on the proportion of records having JSON. You could eliminate the records without JSON faster using (I use 'm.' to retrieve variables faster):
IF EMPTY(m.lcJson) OR OCCURS('{', m.lcJson) # OCCURS('}', m.lcJson)
You could also do the same with fewer instructions, hence faster (note: VFP evaluates the operand after OR only if the first operand is .F., and the operand after AND only if the first operand is .T.):
&& replace
IF llReturn
	RETURN llReturn
ENDIF
IF PEMSTATUS(loJson, 'VdrFk', 5)
	IF loJson.VdrFk = 150
		RETURN .t.
	ENDIF
ENDIF
RETURN .f.
&& by
RETURN m.llReturn OR (PEMSTATUS(loJson, 'VdrFk', 5) AND loJson.VdrFk = 150)
Like Thierry says, I think you'd want to pre-query your data to make sure the field is not empty (or null) to avoid calling the UDF when there's nothing to do. In other words, let the SQL engine do as much of the work as possible to minimize the records you have to process with the UDF.
If you have 1,000 records that's probably fine, but if you have 10,000 the query will likely get much slower, so you have to test this. If you're looking for specific values in the JSON, perhaps you can pre-query the raw JSON for the string match and only then deserialize the document with the UDF.
+++ Rick ---
Thank you both for your suggestions. I was able to get the search time on the same 1,000 records down to just under 0.9xx seconds by reducing the number of qualifying records the UDF() processed. I did this by running the EMPTY() and OCCURS() checks before the TRY/CATCH around DeserializeJson() and returning false when the record did not conform.
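For reference, a sketch of what that revised UDF looks like with the pre-checks in place (names follow the earlier sample, and this version excludes malformed JSON rather than flagging it):

FUNCTION UDF(lcJson)
	* Cheap pre-checks: reject empty memos and strings whose braces
	* don't balance before paying for a full deserialize
	IF EMPTY(m.lcJson) OR OCCURS('{', m.lcJson) # OCCURS('}', m.lcJson)
		RETURN .f.
	ENDIF
	TRY
		loJson = this.oSerializer.DeserializeJson(m.lcJson)
	CATCH
		* Malformed JSON: record does not conform, so exclude it
		RETURN .f.
	ENDTRY
	RETURN PEMSTATUS(loJson, 'VdrFk', 5) AND loJson.VdrFk = 150
ENDFUNC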
Then I went back and took a second look at the data and realized most of the records were not being processed by DeserializeJson(); something like 100 out of the 1,000 actually got processed. So based on Rick's comment about performance, I appended 10,000 qualifying records to test the cycle time. The query took over 17 seconds; yikes, I need another solution!
I now think it is time to add new fields to the table so search times will again be sub-second.
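For what it's worth, a rough sketch of that extracted-field approach (assuming the table and names from the earlier sample, and assuming the populate step is re-run whenever the JSON changes):

USE ('c:\dataPath\Notes') EXCLUSIVE ALIAS Notes
* One-time schema change: promote the searched value to a real, indexable column
ALTER TABLE Notes ADD COLUMN VdrFk I
INDEX ON VdrFk TAG VdrFk

* Populate the new column from the JSON
loSer = CREATEOBJECT('wwJsonSerializer')
SCAN FOR !EMPTY(oJson)
	TRY
		loJson = loSer.DeserializeJson(Notes.oJson)
		IF PEMSTATUS(loJson, 'VdrFk', 5)
			REPLACE VdrFk WITH loJson.VdrFk IN Notes
		ENDIF
	CATCH
		* Skip malformed JSON
	ENDTRY
ENDSCAN

* The search can now use the index instead of a per-row UDF call
SELECT * FROM Notes WHERE VdrFk = 150 INTO CURSOR tQuery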
Thank you both again - Happy Holiday!
I think I'd probably do a query to pre-select the records into a cursor. If you can filter based on string content (i.e. you're matching some specific value), add that into the query and look for it in the JSON content as part of that initial query.
lcId = "12345"
*** Pre-filter query - just return what you need
select id, Json, Description from MyData ;
   where !EMPTY(Json) and Json like "%" + lcId + "%" ;
   into cursor TQuery
SCAN
	lcJson = TQuery.Json
	* ... do your actual processing
ENDSCAN
If you do this right you should end up with just a very small set of records to process in the SCAN loop.
The advantage of this is that you let FoxPro run an efficient query to get the preliminary result, which likely doesn't happen if you use a UDF() in the query.
You can play around with the search filter expression (ATC(), OCCURS(), etc.) to get it as optimized as possible. I think using native SQL rather than a FoxPro string function will be more efficient, because a Fox function call can't be optimized while native SQL clauses can be. OTOH, string searches over large blocks of text are pretty slow in FoxPro when dealing with large data sets, and there's no good way to optimize that either, AFAIK.
+++ Rick ---