Aggregating Data with the Cozy Client

Hi,
How can I aggregate data using the cozy client?

Example:
Let’s say I have a CouchDB database named “myTable” that contains documents with this structure:
{
  "_id": "23ff5f3708eebe0ca15ddecc352c55df",
  "value": 1
}

Question #1: How can I sum the values of the documents using the cozy client?
Question #2: How can I do the same, but only for documents that match a predicate?

Regards

In the CouchDB world, the usual way to handle this kind of need would be a view. However, we do not allow custom views in CouchDB on Cozy (for performance reasons on our side).

For this use case, you have several solutions, depending on the volume of data:

  • Fetch all the data client-side and perform the sum there
  • If the volume becomes large and downloading the data is a problem, you could have a service that listens to changes, computes partial sums, and stores them in dedicated doctypes. See https://docs.cozy.io/en/cozy-stack/jobs/#event-syntax to learn how to trigger a service on data changes.
  • Do the sums inside the konnector fetching the data

For the predicate, you can filter

  • either in JS (fetching all the data and filtering afterwards)
  • or with a CouchDB selector (await client.query(Q('my-doctype').where({ value: { $gt: 5 } })))
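
To make the client-side option concrete, here is a minimal sketch of the sum, with and without a predicate. It assumes the documents have already been fetched into a plain array (for instance with client.query); sumValues and the sample documents are hypothetical names used only for illustration.

```javascript
// Sum the `value` field of documents, optionally keeping only those that
// match a predicate. `docs` is a plain array of already-fetched documents
// (the fetching itself is not shown here).
function sumValues(docs, predicate = () => true) {
  return docs
    .filter(predicate)
    .reduce((total, doc) => total + (Number(doc.value) || 0), 0)
}

// Hypothetical sample documents, shaped like the ones in the question.
const docs = [
  { _id: 'a', value: 1 },
  { _id: 'b', value: 7 },
  { _id: 'c', value: 10 }
]

const total = sumValues(docs) // Question #1: sum of all values
const totalAbove5 = sumValues(docs, doc => doc.value > 5) // Question #2: with a predicate

console.log(total, totalAbove5) // prints: 18 17
```

With the selector variant, the predicate runs in CouchDB and only the matching documents are downloaded, so sumValues(docs) without a predicate would be enough.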

So you have several solutions :) Could you tell me more about your use case? Let me know if you need more help.

Thank you for your prompt response.
We’re fetching and storing hourly data via our konnectors. We’d like to aggregate the fetched data by day, week, month or year in real time, depending on our users’ needs. To put it bluntly, the data is pretty massive, and performing the aggregation client-side might be rather slow.
CouchDB’s map/reduce capabilities would have been great for this use case, but we can’t manage to make them work through the CozyClient.

I think the best way to handle this case would be to have a service triggered by an event (see the doc linked in my previous post) that aggregates the data. In such a service, it is useful to have access to the changes made to the database since the previous run, so that you run your aggregation only on the latest changes. CouchDB provides a seq id that you can use to fetch the changes made since that point. You can then store the latest seq somewhere, so that you know up to which point the aggregation has been done.

CouchDB API: https://docs.couchdb.org/en/2.2.0/api/database/changes.html

This changes API has not been fully integrated in cozy-client yet, but it is used “manually” in cozy-banks to run the categorization and alerting on the latest banking operations inserted in the database.

See: https://github.com/cozy/cozy-banks/blob/master/src/targets/services/onOperationOrBillCreate.js#L25

You can see that the latest changes are fetched with the help of the lastSeq number previously stored (0 if the service has never been run). You can also see that this lastSeq number is updated in the settings document of the application, so it can be retrieved in a following run.
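
As a rough illustration of the same idea applied to the hourly data, here is a minimal sketch of folding one CouchDB _changes response into running per-day sums. It is not the cozy-banks code. It assumes the changes were fetched with include_docs=true, that each document has a hypothetical ISO 8601 date field next to value, and that documents are inserted once and never updated (an updated document would otherwise be counted twice). applyChanges and the shape of the stored state are made-up names.

```javascript
// Fold one CouchDB `_changes` response into running per-day sums.
// `state` is what the service would persist between runs:
//   { lastSeq, sumsByDay: { 'YYYY-MM-DD': number } }
// Assumptions: hypothetical `date` field (ISO 8601 string) on each document,
// and insert-only documents.
function applyChanges(state, changes) {
  const sumsByDay = { ...state.sumsByDay } // copy, so the previous state is not mutated
  for (const row of changes.results) {
    // Deleted documents carry no usable body: skip them.
    if (row.deleted || !row.doc) continue
    const day = row.doc.date.slice(0, 10) // 'YYYY-MM-DD' prefix of the ISO date
    sumsByDay[day] = (sumsByDay[day] || 0) + row.doc.value
  }
  // Remember where we stopped, so the next run only asks for newer changes.
  return { lastSeq: changes.last_seq, sumsByDay }
}

// First run: nothing stored yet, so we start from seq 0.
const firstRun = applyChanges(
  { lastSeq: 0, sumsByDay: {} },
  {
    results: [
      { id: 'h1', seq: '1-a', doc: { _id: 'h1', date: '2021-03-01T00:00:00Z', value: 1 } },
      { id: 'h2', seq: '2-b', doc: { _id: 'h2', date: '2021-03-01T01:00:00Z', value: 2 } },
      { id: 'h3', seq: '3-c', doc: { _id: 'h3', date: '2021-03-02T00:00:00Z', value: 5 } },
      { id: 'h4', seq: '4-d', deleted: true }
    ],
    last_seq: '4-d'
  }
)

// Second run: only the changes made after firstRun.lastSeq are processed.
const secondRun = applyChanges(firstRun, {
  results: [
    { id: 'h5', seq: '5-e', doc: { _id: 'h5', date: '2021-03-02T01:00:00Z', value: 4 } }
  ],
  last_seq: '5-e'
})

console.log(secondRun)
```

Weekly, monthly or yearly totals could then be derived from the stored daily sums, which are far fewer than the hourly documents.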

You can see here how the service is configured to be called when a bill has been created (and also when a transaction has been updated):

A 75s debounce is applied so that a konnector run inserting N transactions does not trigger N launches of the service.
