The fulltext search index is stored as a set of strings. The datastore allows you to query for items in a set as if it were a singular attribute, like a single String for example. So for a set of strings index={“green”, “yellow”, “red”} you can write a query in this way WHERE index == “red” which would be a hit in this case. I tried to do a query like this WHERE index == “red” AND index == “green” order by date which failed with an error telling me that I need to create a datastore index for that query. I thought that the error came from the fact that I had two filters on the attribute index. But I was wrong. The problem was that I wanted to sort the results at the same time. So, if you forget about in-database sorting, the query works and it works darn fast. And even better: you don’t even need to create an index for AND-connected equality filters like that! Which means you can save storage space (in my case almost 1GB!).
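To make the matching rule concrete, here is a minimal sketch in plain Python. It only models the behaviour described above (an equality filter on a set is a hit if any element matches, and AND-connected filters must all hit); it does not call the datastore API. The entities, the `matches_all` and `search` helpers, and the sample dates are made up for illustration.

```python
from datetime import date

# Hypothetical in-memory stand-ins for stored entities.
# "index" is the set of strings that serves as the fulltext index.
ITEMS = [
    {"title": "traffic light", "date": date(2010, 3, 1),
     "index": {"green", "yellow", "red"}},
    {"title": "apple", "date": date(2010, 1, 15),
     "index": {"green", "red"}},
    {"title": "banana", "date": date(2010, 2, 10),
     "index": {"yellow"}},
]


def matches_all(entity, terms):
    """Model of `index == t1 AND index == t2 ...` on a set-valued attribute."""
    return all(term in entity["index"] for term in terms)


def search(items, terms):
    """Filter by AND-connected equality tests, then sort in memory by date."""
    hits = [entity for entity in items if matches_all(entity, terms)]
    # The ordering happens here, not in the query itself.
    return sorted(hits, key=lambda entity: entity["date"])


results = search(ITEMS, ["red", "green"])
print([entity["title"] for entity in results])
```

The point of the sketch is the split of responsibilities: the query only decides which entities match, and the ordering is applied afterwards on the (usually much smaller) result list.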
Here are some rules of thumb that I learned from this lesson. Due to the way the datastore is organized (don’t forget its BigTable nature!) you have to pay attention when designing your data model.
- You can query easily on sets
- Try to avoid inequality filters and filters that use OR. The best filters are equality filters connected with AND, which find the intersection of several conditions on ONE entity
- Sort your results in-memory to avoid the need for indexes
- Sets are expensive to serialize/deserialize. This cost occurs whenever you write a set to or read a set from the datastore. To avoid it, put your full text index on a separate child object. You query the child object for its key only. With that key you get the parent object and then you can do in-memory sorting/ranking
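
The last rule can be sketched as follows. This is again a plain-Python model under stated assumptions, not datastore code: `DOCUMENTS`, `SEARCH_INDEXES`, `find_parent_keys` and `search_documents` are hypothetical names, and the dictionaries stand in for parent and child entities. The child carries the set of strings, the query on it hands back only keys, and the parents are then loaded by key and sorted in memory.

```python
from datetime import date

# Hypothetical parent entities, keyed by id. They carry no search terms,
# so reading or writing them never touches the set.
DOCUMENTS = {
    1: {"title": "traffic light", "date": date(2010, 3, 1)},
    2: {"title": "apple", "date": date(2010, 1, 15)},
    3: {"title": "banana", "date": date(2010, 2, 10)},
}

# Hypothetical child entities: only the parent's key and the fulltext index.
SEARCH_INDEXES = [
    {"parent_key": 1, "index": {"green", "yellow", "red"}},
    {"parent_key": 2, "index": {"green", "red"}},
    {"parent_key": 3, "index": {"yellow"}},
]


def find_parent_keys(terms):
    """Model of a key-only query on the child: the sets are never returned."""
    return [child["parent_key"] for child in SEARCH_INDEXES
            if all(term in child["index"] for term in terms)]


def search_documents(terms):
    """Resolve keys to parent objects, then rank in memory (newest first)."""
    parents = [DOCUMENTS[key] for key in find_parent_keys(terms)]
    return sorted(parents, key=lambda doc: doc["date"], reverse=True)


print([doc["title"] for doc in search_documents(["red", "green"])])
```

In this model the expensive set only exists on the child, so loading a parent for display or ranking never pays the deserialization cost.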