[ How do I efficiently search a potentially large database? ]
This is more of a discussion.
We have a system which is multitenanted and will have tables that can have millions of rows. Our UI allows users to perform searches against these tables with many different search criterias -- so they can have any combination of these criteria.
It is not practical to index all these search columns in the database or to load the full tables in memory and then filter.
Can anybody point me in the correct direction for patterns/designs that tackles this issue?
Answer 1
I'm not aware of any pattern to solve the situation you describe. An unlimited number of rows, completely ad hoc queries, many simultaneous users? That's not a requirement; that's "anything goes".
I'll assume that this is a reporting database, not transactional. The data is read-only. Is that correct?
A data warehouse with a star schema would prescribe queries along thoughtfully designed dimensions. Users can roll up dimensions (e.g. time dimension allows users to roll up to day, week, month, quarter, year, etc.). But the logic to do so is executed on the database and coded in stored procedures.
I'd challenge the assertion that users need millions of rows on the middle tier. No user can take in millions of rows at once. Google returns the millions of pages returned by a single query twenty-five at a time.
Maybe you can stream down data sets that are used in a detached manner, analyzed using Excel or some other tool. But that's the best scenario I can envision.