Friday, August 23, 2019

Kafka Architecture



Cluster Architecture

Apache Kafka Architecture and Components





Components:

Kafka Broker : A Kafka cluster typically consists of multiple brokers so that load can be balanced across them. Brokers do not track cluster state themselves; they rely on ZooKeeper to maintain it.

Producer : Producers publish messages to one or more Kafka topics.

Consumer : Consumers subscribe to one or more topics and consume published messages by pulling data from the brokers.

Topic : Stream of records. A topic is a category or feed name to which records are published.

Partition: For each topic, the Kafka cluster maintains a partitioned log. Each partition is an ordered, immutable sequence of records that is continually appended to—a structured commit log. The records in the partitions are each assigned a sequential id number called the offset that uniquely identifies each record within the partition.

Note: Kafka only provides a total order over records within a partition, not between different partitions in a topic.
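To make the partition and offset idea concrete, here is a minimal in-memory sketch. It is a toy model, not the Kafka client API: the class ToyTopic, its methods, and the choice of partition by key.hashCode() are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of a partitioned topic; NOT the Kafka client API.
class ToyTopic {
    // One append-only list per partition.
    private final List<List<String>> partitions = new ArrayList<>();

    ToyTopic(int partitionCount) {
        for (int p = 0; p < partitionCount; p++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Assumption for illustration: the partition is chosen from the key's hash,
    // so records with the same key always land in the same partition.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Appends a record and returns the offset assigned within its partition.
    long append(String key, String value) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(value);
        return log.size() - 1;
    }

    // Reads the record stored at the given offset of a partition.
    String read(int partition, long offset) {
        return partitions.get(partition).get((int) offset);
    }
}
```

Offsets in this sketch start at 0 and grow by one per partition, so two records with the same key keep their relative order, while records in different partitions have no defined order relative to each other.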


Monday, August 19, 2019

Generate integer from 1 to 7 with equal probability

Given a function foo() that returns integers from 1 to 5 with equal probability, write a function that returns integers from 1 to 7 with equal probability using only foo(). Minimize the number of calls to foo().
Sol:
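We can generate integers from 1 to 25 with equal probability using the following expression, and then keep only the values 1 to 21 (21 is the largest multiple of 7 that does not exceed 25).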
 5*foo() + foo() -5 
Let us see how the above expression can be used.
1. For each value of the first foo(), there are 5 possible values of the second foo(). So there are 25 possible combinations in total.
2. The expression takes values from 1 to 25, and each integer is produced by exactly one combination.
3. If the value is less than 22, return the value modulo 7, plus 1. Otherwise, discard it and call the method again. Because 1 to 21 contains exactly three values for each remainder modulo 7, each integer from 1 to 7 is returned with probability 1/7.
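// Returns 1 to 7 with equal probability
public static int getRandom() {
    // Uniform over 1..25: each pair of foo() results maps to a distinct value
    int i = 5 * foo() + foo() - 5;
    if (i < 22) {
        return i % 7 + 1; // 1..21 covers each remainder mod 7 three times
    }
    return getRandom(); // reject 22..25 and try again
}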
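
The two claims in the solution (every value from 1 to 25 occurs exactly once, and the accepted values split evenly over 1 to 7) can be checked by enumerating all pairs instead of sampling. The following is a small sketch; the class name Rand7Check and its method names are made up for this check.

```java
import java.util.Arrays;

// Exhaustive check of the 5*foo() + foo() - 5 construction.
// It enumerates every (first, second) pair of foo() results, so the check is exact.
class Rand7Check {
    // counts[v] = number of pairs (a, b) with 5*a + b - 5 == v, for v in 1..25
    static int[] valueCounts() {
        int[] counts = new int[26];
        for (int a = 1; a <= 5; a++) {
            for (int b = 1; b <= 5; b++) {
                counts[5 * a + b - 5]++;
            }
        }
        return counts;
    }

    // counts[r] = number of accepted values v in 1..21 with v % 7 + 1 == r, for r in 1..7
    static int[] resultCounts() {
        int[] counts = new int[8];
        for (int v = 1; v <= 21; v++) {
            counts[v % 7 + 1]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(valueCounts()));
        System.out.println(Arrays.toString(resultCounts()));
    }
}
```

On the number of calls: a round is accepted with probability 21/25, so the expected number of rounds is 25/21, and with two calls to foo() per round the expected number of calls is 50/21, about 2.38.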

Find most common name in the table

How can I find the most frequent value in a given column in an SQL table?
For example, for this table it should return two since it is the most frequent value:
one
two
two
three
Ans:
SELECT ColName, COUNT(ColName) AS count
FROM Table    
GROUP BY ColName
ORDER BY count DESC
LIMIT 1;
Note: Replace the 1 in LIMIT 1 with N if you want to see the N most common values of the column.
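
For comparison, the same group, count, sort, and limit steps can be sketched in Java over an in-memory list. This is only an illustration of what the query does; the class MostCommon and the method topN are hypothetical names, and the order among values with equal counts is not defined here.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MostCommon {
    // Returns up to n (value, count) entries, most frequent first.
    static List<Map.Entry<String, Long>> topN(List<String> column, int n) {
        return column.stream()
                // GROUP BY ColName with COUNT(ColName)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                // ORDER BY count DESC
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                // LIMIT n
                .limit(n)
                .collect(Collectors.toList());
    }
}
```

Calling topN with the four rows from the example (one, two, two, three) and n = 1 gives a single entry for two with a count of 2.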

Monday, August 5, 2019

Design Interview Questions

1. Design TinyURL or bitly (a URL shortening service)

2. Design YouTube, Netflix or Twitch (a global video streaming service)

3. Design Facebook Messenger or WhatsApp (a global chat service)

4. Design Quora or Reddit or HackerNews (a social network + message board service)

5. Design Dropbox or Google Drive or Google Photos (a global file storage & sharing service)

6. Design Facebook, Twitter or Instagram (a social media service with hundreds of millions of users)

7. Design Uber or Lyft (a ride sharing service)

8. Design a Web Crawler or Type-Ahead (search engine related services)

9. Design Yelp or Nearby Places/Friends (a proximity server)

Steps to approach a System Design Interview

Step 1: Requirement Gathering
Step 2: System interface definition
Step 3: Capacity estimation
Step 4: High-level design (System API) : REST APIs to expose the functionality of our service
Step 5: Detailed design for selected components (Component Design)
Step 6: Database Design & Data Sharding: MySQL or NoSQL
Step 7: Load Balancing (LB)
Step 8: Cache
Step 9: Security and Permissions
Step 10: Identifying and resolving bottlenecks
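
As an illustration of Step 3, capacity estimation is mostly back-of-the-envelope arithmetic. The sketch below uses a URL shortener as the example; every input (500 million new URLs per month, 100 reads per write, 500 bytes per record, 5 years of retention) is an assumed figure chosen only to show the calculation, and the class and method names are hypothetical.

```java
// Back-of-the-envelope capacity estimate; all inputs are assumptions.
class CapacityEstimate {
    static final long SECONDS_PER_MONTH = 30L * 24 * 3600; // 30-day month

    // Average writes per second for a given monthly volume.
    static double writesPerSecond(long newItemsPerMonth) {
        return (double) newItemsPerMonth / SECONDS_PER_MONTH;
    }

    // Total bytes stored if every item is kept for the given number of years.
    static long storageBytes(long newItemsPerMonth, int years, int bytesPerItem) {
        return newItemsPerMonth * 12 * years * bytesPerItem;
    }

    public static void main(String[] args) {
        long newUrlsPerMonth = 500_000_000L; // assumed traffic
        int readsPerWrite = 100;             // assumed read/write ratio
        double wps = writesPerSecond(newUrlsPerMonth);
        System.out.printf("writes/s ~ %.0f, reads/s ~ %.0f%n", wps, wps * readsPerWrite);
        System.out.printf("storage ~ %.1f TB%n",
                storageBytes(newUrlsPerMonth, 5, 500) / 1e12);
    }
}
```

The point of the exercise is the order of magnitude, not the exact figures: the results feed directly into the later steps on sharding, load balancing, and caching.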

References:
https://hackernoon.com/anatomy-of-a-system-design-interview-4cb57d75a53f
https://www.educative.io/collection/page/5668639101419520/5649050225344512/5668600916475904