Servers​ - What kind of servers will you use, and what functionality and responsibility does each server carry?

Although this architecture increases latency, it improves the service's fault tolerance and simplifies zero-downtime deployment. It should not be used for prototypes or an MVP, though, because the implementation is time-consuming and error-prone.

Persistent storage servers

Client-server - Explain how client-server communication works, and how the connection keep-alive between the client and the server is done (while preserving battery on mobile devices).

I have no prior experience with preserving battery on mobile devices, so I have no practical knowledge of proper keep-alive techniques.

Main sequence for sending a new message:

  1. When the user creates a new message, the application generates a unique id (an int that is unique per device is enough).
  2. The application then creates the full message header with some meta-information (client-generated-id, sender, unique device-id, receiver, sender-timestamp).
  3. The application encrypts and encodes all message parts (text, audio, video) and fills in extra meta-information fields such as message part sizes and types.
  4. Open an active keep-alive connection with the server if we have none.
  5. Message meta-information exchange (short text parts can be sent together with the meta-information):
    1. The application sends the message meta-information to the server.

    2. The server processes the meta-information, assigns it a server-id and a server-timestamp, and stores it in storage.

      Note: the server-id can actually be computed on clients if it is derived from the client-generated-id, sender-id and unique-device-id.

    3. The server puts the updated meta-information (with server-id and server-timestamp) in the queues for the receiver and the sender.

    4. The sender receives the updated meta-information and updates its local copy.

    5. The receiver receives the meta-information and creates the message in the chat.

  6. Independently of the meta-information exchange, we can start the message-parts exchange:
    1. If both sides are online, we can try to use a P2P connection with a NAT traversal solution (like STUN or TURN) and send the message directly to the receiver. In that case we can also send the meta-information directly to mitigate possible race conditions.
    2. If the receiver isn’t online, we should fall back to client-server-client delivery, with the message parts stored in object storage on the server.
  7. After the receiver has received all message parts, it generates a receiver-timestamp and sends it to the server.
  8. The server processes the notification from the receiver, saves it in the meta-information storage, and delivers it to the sender.
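
The meta-information from steps 2, 3 and 5 could look roughly like the sketch below. The field types, the `MessageMeta` and `server_accept` names, and the use of SHA-256 for the server-id are all assumptions made for illustration; the only thing taken from the note above is that the server-id is derived from the client-generated-id, the sender and the device id, so a client can compute it without waiting for the server.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageMeta:
    # Fields named in the sequence above; the types are assumptions.
    client_generated_id: int
    sender: str
    device_id: str
    receiver: str
    sender_timestamp: float
    part_sizes: List[int] = field(default_factory=list)  # bytes per encrypted part
    part_types: List[str] = field(default_factory=list)  # e.g. "text", "audio"
    server_id: Optional[str] = None
    server_timestamp: Optional[float] = None
    receiver_timestamp: Optional[float] = None


def derive_server_id(meta: MessageMeta) -> str:
    """Hypothetical derivation: hash the three fields mentioned in the note."""
    # JSON encoding keeps the field boundaries unambiguous before hashing.
    raw = json.dumps([meta.sender, meta.device_id, meta.client_generated_id])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def server_accept(meta: MessageMeta, now: Optional[float] = None) -> MessageMeta:
    """Step 5.2: the server stamps the meta-information before storing it."""
    meta.server_id = derive_server_id(meta)
    meta.server_timestamp = time.time() if now is None else now
    return meta
```

Because `derive_server_id` depends only on data the sender already has, the sender can address the message by its server-id before the server's reply arrives; only the server-timestamp has to come from the server.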

On the receiver side, if we have an active keep-alive connection with the server, we just react to incoming messages.

If we have no active connection, we should wait for a push notification from the server via the standard mobile-platform services. But because delivery through these services is not guaranteed, we should also ping our server with a simple request like “do you have any new data for me?”. These ping requests can also be used to determine the online status of users. If we receive a notification about new data, we start a new keep-alive connection.
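
The receiver behaviour described above can be written down as a small decision function. This is only a sketch: the function name, the action names and the 60-second default ping interval are assumptions, not values from any platform guideline.

```python
def receiver_next_action(has_connection: bool,
                         new_data_signalled: bool,
                         seconds_since_last_ping: float,
                         ping_interval: float = 60.0) -> str:
    """Decide what the receiving client should do next.

    new_data_signalled is True when either a push notification or a
    previous ping response said that the server holds new data.
    """
    if has_connection:
        # An active keep-alive connection exists: just react to messages.
        return "handle_messages"
    if new_data_signalled:
        # We were told about new data: start a new keep-alive connection.
        return "open_connection"
    if seconds_since_last_ping >= ping_interval:
        # Push delivery is not guaranteed, so fall back to asking the server.
        return "ping_server"
    return "wait"
```

A longer `ping_interval` saves battery at the cost of slower delivery when a push notification is lost, so the value would have to be tuned on real devices.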

A keep-alive connection can be implemented with two open TCP sockets: one is always ready to send data to the server via HTTP, and the second is always waiting for a response from the server (long-poll HTTP, XEP-0124: Bidirectional-streams Over Synchronous HTTP (BOSH)).
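
The "always waiting" half of that connection is the long-poll. The sketch below imitates it in memory, without real sockets or HTTP: `LongPollChannel` is a hypothetical stand-in for the server-side state of one client, and `poll` plays the role of the held-open request.

```python
import queue


class LongPollChannel:
    """In-memory stand-in for the server side of one client's long-poll."""

    def __init__(self) -> None:
        self._outbox: queue.Queue = queue.Queue()

    def push(self, event) -> None:
        """Called by the server when it has something for this client."""
        self._outbox.put(event)

    def poll(self, timeout: float) -> list:
        """Hold the request until an event arrives or the timeout expires."""
        try:
            first = self._outbox.get(timeout=timeout)
        except queue.Empty:
            # Nothing happened: answer empty, the client re-issues the poll.
            return []
        events = [first]
        # Drain whatever else is already queued into the same response.
        while True:
            try:
                events.append(self._outbox.get_nowait())
            except queue.Empty:
                return events
```

In a real deployment the timeout has to stay below the idle timeouts of the proxies and mobile networks in between, otherwise the held request is dropped silently.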

Networking​ - What networking protocols will be used and how push-notifications will work?

For online communication we can use a P2P connection and send message bodies directly.

As for push-notifications

Storage & Data​ - What kind of data will you store, and what will you use to store it?

For the initial structure of the service I am thinking of the following high-level models

Scale​ - Explain how the system will scale over time.

Authentication & Encryption (lower priority for this task)​ - describe how the signup process will work, how authentication is done, and how end-to-end encryption will be implemented.

Signup

  1. The user chooses a username and a communication method (phone, email) and sends them to the server.
  2. The server generates a one-time token and sends it to the user via that communication method.
  3. The user provides the one-time token back to the server (the application can read it automatically from an SMS).
  4. The server validates the token:
    • If we sign up via the website, the server requests a password.
    • If we sign up via the application, we perform a hidden client-server exchange and create OAuth2 tokens, which are stored on the client and the server.
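
Steps 2 to 4 of the signup could be sketched as follows. The class name, the 6-digit token format, the 5-minute lifetime and the single-attempt policy are assumptions chosen for the example; delivery by SMS or email and the OAuth2 exchange are left out.

```python
import hashlib
import hmac
import secrets
import time
from typing import Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 300  # assumed lifetime of a one-time token


class SignupTokens:
    """Issues and validates one-time signup tokens (in-memory sketch)."""

    def __init__(self) -> None:
        # contact (phone or email) -> (sha256 of token, expiry time)
        self._pending: Dict[str, Tuple[str, float]] = {}

    def issue(self, contact: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        token = f"{secrets.randbelow(10**6):06d}"  # 6 random digits
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        self._pending[contact] = (digest, now + TOKEN_TTL_SECONDS)
        return token  # to be sent to the user via SMS or email

    def verify(self, contact: str, token: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # pop() makes the token single-use: one attempt, right or wrong.
        entry = self._pending.pop(contact, None)
        if entry is None:
            return False
        digest, expires_at = entry
        if now > expires_at:
            return False
        candidate = hashlib.sha256(token.encode("utf-8")).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched.
        return hmac.compare_digest(candidate, digest)
```

Only the hash of the token is kept on the server, and a failed attempt discards the token, which limits guessing against the short 6-digit space.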

Signin

As for security, although HTTPS provides encryption, many vulnerabilities have been discovered in it in recent years, so it’s possible to add extra security to the signin process using techniques based on zero-knowledge password proof.

End-to-end encryption

This is an extremely big question and can’t be properly solved without defined requirements. One possible solution: when two clients start a chat, they generate a symmetric key and perform a secure key-exchange procedure that prevents MITM attacks.

For the actual message encryption AES-256 can be used, as it has no known practical vulnerabilities; for the key exchange, a protocol based on asymmetric cryptography should be used.

The most sensitive part here is the initial key exchange, as it must avoid disclosing the key to the server or any other third party.
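
One asymmetric option for that key exchange is Diffie-Hellman key agreement, where the server relays only public values and never sees the resulting key. The sketch below is a toy: the group parameters `P` and `G` are assumptions picked to keep the example short and are far too small for real use, and the function names are hypothetical. Plain Diffie-Hellman is also unauthenticated, so on its own it does not stop a MITM; the clients would still have to verify each other's public values, for example by comparing fingerprints out of band.

```python
import hashlib
import secrets

# Toy parameters, for illustration only. 2**127 - 1 is prime, but a real
# system needs a standardized large group or an elliptic-curve exchange.
P = 2**127 - 1
G = 3


def generate_keypair():
    """Return (private, public); only the public value is sent to the peer."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_key(own_private: int, peer_public: int) -> bytes:
    """Derive a 32-byte key (the AES-256 key size) from the shared secret."""
    shared = pow(peer_public, own_private, P)
    # The shared value is below 2**127, so it always fits in 16 bytes.
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Each client generates a key pair and sends its public value via the server.
alice_private, alice_public = generate_keypair()
bob_private, bob_public = generate_keypair()

# Both sides arrive at the same key without it ever crossing the network.
alice_key = derive_key(alice_private, bob_public)
bob_key = derive_key(bob_private, alice_public)
```

The example stops at the shared key because the Python standard library has no AES implementation; the actual message encryption would need a vetted cryptography library.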