I’m building a MarkLogic cluster that will be used by accounting firms to manage client tax data.
· Each accounting firm will have a unique set of users and clients.
· Every accounting firm needs their own isolated “dataspace” to store client data.
· Need to support 5,000 accounting firms.
· Need to have a “physical” firewall to isolate data of each accounting firm.
· It is very important to not commingle data between accounting firms.
· Accounting Firm A – will have 20 users managing data for 40 clients.
· Accounting Firm B – will have 25 users managing data for 100 clients.
· Accounting Firm C – will have 100 users managing data for 500 clients.
I see two approaches: Option 1 (application/user level) and Option 2 (port/database level).
Option 1 is the salesforce.com approach, where each firm has a unique REST endpoint with its own set of users and permissions.
Option 2 is to give every accounting firm its own database and port number. This means the MarkLogic cluster will have roughly 10,000 forests and roughly 5,000 HTTP servers.
If Option 2 is used, the accounting firm onboarding process will be fully automated: a web app will create each firm's resources through the REST Management API (http://docs.marklogic.com/REST/management).
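To make the Option 2 onboarding concrete, here is a minimal sketch of the requests such a web app might send for one firm. It assumes the Management API's POST endpoints for databases, forests, and servers; the naming scheme (firm-0042-content), the base port of 9000, two forests per database, the "Default" group, and the host name are all hypothetical choices for illustration, not settled design.

```python
import json
import urllib.request


def tenant_payloads(firm_id, host, base_port=9000, forests_per_db=2):
    """Build the (path, JSON body) pairs needed to onboard one firm.

    Naming scheme, base_port and forests_per_db are illustrative assumptions.
    """
    db = f"firm-{firm_id:04d}-content"
    steps = [("/manage/v2/databases", {"database-name": db})]
    # One request per forest; each forest is attached to the firm's database.
    for i in range(forests_per_db):
        steps.append(("/manage/v2/forests", {
            "forest-name": f"{db}-{i + 1}",
            "host": host,
            "database": db,
        }))
    # One HTTP app server per firm, on its own port.
    steps.append(("/manage/v2/servers?group-id=Default", {
        "server-name": f"firm-{firm_id:04d}-http",
        "server-type": "http",
        "group-name": "Default",
        "root": "/",
        "port": base_port + firm_id,
        "content-database": db,
    }))
    return steps


def send(manage_url, user, password, steps):
    """POST each step to the Management API using digest authentication."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, manage_url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(mgr))
    for path, body in steps:
        req = urllib.request.Request(
            manage_url + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with opener.open(req) as resp:
            print(path, resp.status)
```

Onboarding firm 42 would then look something like send("http://ml-node-1.example.com:8002", "admin", "...", tenant_payloads(42, "ml-node-1.example.com")), repeated for each of the 5,000 firms.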
1. What is the best approach to support multiple tenants like this?
2. For Option 2, is using a unique database and port number for each of 5,000 firms considered too much?
3. For Option 2, how will 10,000 forests on a 3-node cluster impact performance, considering the rule of thumb of 2 CPU cores per forest?
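The arithmetic behind question 3 can be sketched quickly. This only applies the 2-cores-per-forest rule of thumb quoted above to the numbers in this post, assuming forests are spread evenly across the nodes; it is not a sizing recommendation.

```python
import math


def forest_load(total_forests, nodes, cores_per_forest):
    """Return (forests per node, cores per node implied by the rule of thumb)."""
    per_node = math.ceil(total_forests / nodes)
    return per_node, per_node * cores_per_forest


per_node, cores = forest_load(total_forests=10_000, nodes=3, cores_per_forest=2)
print(per_node, cores)  # 3334 6668
```

In other words, taken literally, the rule of thumb implies several thousand cores per node, which is why I am asking whether it still applies when most forests are small.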
Enterprise NoSQL Developer