Requirements
- behavior (what system is supposed to do)
- quality of system (how system is supposed to be)
- if interviewer gives ambiguous problem
- my reply should be the problem seems too big
- let me try to define specific functions and reduce the scope
- do we need to scale for writes or reads?
- both
- to scale for writes
- partition messages and store in separate queues
- which partition strategy to use?
- where do i store quickly?
- in memory (bounded queue or disk) or in disk (append-only log or an embedded database)
- if database
- should i pick b-tree or lsm tree?
- more probably lsm as such dbs are faster for writes
- to scale for read
- paritioning will help here too. we will have consumer per partition
- should i use pull or push for reading messages?
- if i go with pull, i need to make sure system supports long polling to decrease number of read requests
- how to get high availability
- i need to replicate messages
- leaderbased or leaderless replication?
- most likely leader based, but then i need to solve leader election problem
- that should be easy i can use coordination service or a DB that guarantees strong consistency
- how to make system reliable
- i need some protection mechanisms (load shedding, rate limiting, maybe shuffle sharding)
- should i use reverse proxy - maybe
- it will take care partition discovery and message routing
- how to make it fast?
- i should consider batching and compressing
- write the functional and non-functional requirements. keep a list.
- higher the item in the list, more important it is
when you know what concepts exist to address what requirements, you are much better set for a meaningful discussion with interviewer