Twitter, MAU = 300M, DAU = 150M

Scenario

Concurrent users:

  • QPS = DAU * average requests per user / seconds per day = 150M * 60 / 100k = 90k (86,400 seconds per day, rounded up to 100k)
  • Peak QPS = 3 * QPS = 270k
  • Read QPS = 300k
  • Write QPS = 5k
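
The back-of-the-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `average_qps` is made up for illustration, and the 100k default mirrors the rounding used in these notes rather than the exact 86,400 seconds in a day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400; the notes round this up to 100k


def average_qps(dau, requests_per_user, seconds_per_day=100_000):
    """Average queries per second for a given number of daily active users."""
    return dau * requests_per_user / seconds_per_day


avg = average_qps(150_000_000, 60)                     # rounded estimate from the notes
peak = 3 * avg                                         # peak assumed to be 3x the average
exact = average_qps(150_000_000, 60, SECONDS_PER_DAY)  # same estimate without rounding

print(f"average ~{avg:,.0f}, peak ~{peak:,.0f}, unrounded average ~{exact:,.0f}")
```

Rounding the day to 100k seconds underestimates the average by roughly 14%, which is acceptable for capacity planning at this level of detail.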

Service

  • split into micro-services

  • define features for each service

e.g.

  • user service
    • login
    • register
  • tweet service
    • post tweet
    • news feed timeline
  • media service
    • upload image
    • upload video
  • friendship service
    • follow
    • unfollow

Storage

  • SQL / NoSQL / file system?

  • schema design: star schema, normalized vs. denormalized

e.g.

  • user service: SQL (id, username, email, password)
  • tweet service: NoSQL (id, userId, content, createdAt)
  • media service: File System
  • friendship service: SQL / NoSQL (from_userId, to_userId)
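
As a rough illustration, the example schemas above could be written down as plain record types. The field names follow the notes, except `password_hash`, which is an assumption (the notes only say "password"; storing a hash rather than the raw value is the usual practice). The sample values are invented.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class User:  # user service (SQL)
    id: int
    username: str
    email: str
    password_hash: str  # assumption: a hash is stored, not the raw password


@dataclass(frozen=True)
class Tweet:  # tweet service (NoSQL)
    id: int
    user_id: int
    content: str
    created_at: datetime


@dataclass(frozen=True)
class Friendship:  # friendship service: from_user follows to_user
    from_user_id: int
    to_user_id: int


alice = User(1, "alice", "alice@example.com", "<hash>")
bob = User(2, "bob", "bob@example.com", "<hash>")
edge = Friendship(from_user_id=alice.id, to_user_id=bob.id)  # alice follows bob
tweet = Tweet(1, bob.id, "hello", datetime(2024, 1, 1, tzinfo=timezone.utc))
```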

Scale

Optimize

  • pull or push? normalize or denormalize?
  • more features?
  • special cases, such as celebrity accounts (e.g. Lady Gaga) and inactive users

Maintenance

  • handle failure
  • scalability

News Feed

  • Twitter / Facebook / RSS reader / WeChat Moments (Friend Circle)...
pull model
  • read news feed
    • get followings (friendship table)
    • get tweets from followings (tweets table)
    • merge N sorted arrays
    • analysis: N DB reads + an N-way merge
  • post a tweet
    • 1 DB write (tweets table)
  • cons:
    • N * DB.getTweets(following, 100): N DB reads are expensive and block the request
    • mitigation: cache
      • cache each user's timeline (top 100 tweets) to reduce pull time
      • cache each user's news_feed to reduce merge time
  • pros:
    • 1 DB write
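
A minimal sketch of the pull-model read path, assuming in-memory dictionaries stand in for the friendship and tweets tables. `FOLLOWINGS`, `TIMELINES`, `get_tweets` and the sample data are all hypothetical; the N-way merge uses `heapq.merge` from the standard library.

```python
import heapq
from itertools import islice

# Hypothetical stand-ins for the friendship table and the tweets table.
FOLLOWINGS = {"alice": ["bob", "carol"]}
# Each timeline is stored newest first as (created_at, tweet_id) pairs.
TIMELINES = {
    "bob": [(50, "b3"), (30, "b2"), (10, "b1")],
    "carol": [(40, "c2"), (20, "c1")],
}


def get_tweets(user_id, limit):
    """Stand-in for one DB read: the user's latest `limit` tweets, newest first."""
    return TIMELINES.get(user_id, [])[:limit]


def read_news_feed(user_id, k=100):
    """Pull model: N DB reads followed by an N-way merge of sorted timelines."""
    followings = FOLLOWINGS.get(user_id, [])
    timelines = [get_tweets(f, k) for f in followings]  # N DB reads
    merged = heapq.merge(*timelines, reverse=True)      # inputs are sorted newest first
    return [tweet_id for _, tweet_id in islice(merged, k)]


print(read_news_feed("alice", k=4))
```

Because `heapq.merge` is lazy, only the first k merged entries are materialized, but the N reads still happen on every request, which is why caching timelines matters here.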
push model
  • news feed table
    • id, ownerId, tweetId, createdAt
  • read news feed
    • 1 DB read from news feed table, top k latest
  • post a tweet
    • insert into the tweet table
    • get followers from the friendship table
    • fan out: insert one row per follower into the news feed table
  • cons:
    • when the follower count is huge (e.g. Lady Gaga), the push may take a very long time
    • mitigation:
      • rank followers by weight (e.g. last login time) and push to the highest-ranked first
      • mark such accounts as "star" users. A "star" user's tweets are not pushed to the news feed table; followers pull from the "star" user's timeline instead
        • merge results from the news feed table and the "star" users' timelines
  • pros:
    • getting followers and fanning out can be done asynchronously
    • 1 DB read
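
The push model with the "star" user workaround can be sketched as follows. Everything here is illustrative: the tables are in-memory dictionaries, `STAR_THRESHOLD` is an arbitrary cutoff chosen for the demo, and the fan-out loop runs inline where a real system would hand it to an asynchronous worker.

```python
from collections import defaultdict

STAR_THRESHOLD = 2  # hypothetical cutoff; a real system would use a far larger number

followers = defaultdict(set)   # friendship table: author -> followers
followings = defaultdict(set)  # reverse index: follower -> authors
tweets = defaultdict(list)     # tweet table: author -> [(created_at, tweet_id)]
news_feed = defaultdict(list)  # news feed table: owner -> [(created_at, tweet_id)]


def follow(from_user, to_user):
    followers[to_user].add(from_user)
    followings[from_user].add(to_user)


def is_star(user):
    return len(followers[user]) >= STAR_THRESHOLD


def post_tweet(author, tweet_id, created_at):
    tweets[author].append((created_at, tweet_id))  # 1 write to the tweet table
    if is_star(author):
        return  # no fan-out for star users; followers pull their timeline on read
    for follower in followers[author]:  # fan-out (would be async in production)
        news_feed[follower].append((created_at, tweet_id))


def read_news_feed(user, k=100):
    entries = set(news_feed[user])  # 1 read from the news feed table
    for author in followings[user]:
        if is_star(author):         # pull and merge star timelines
            entries.update(tweets[author])
    newest_first = sorted(entries, reverse=True)
    return [tweet_id for _, tweet_id in newest_first[:k]]


follow("alice", "bob")
follow("alice", "gaga")
follow("dave", "gaga")  # gaga now has 2 followers and counts as a star
post_tweet("bob", "b1", 10)
post_tweet("gaga", "g1", 20)
post_tweet("bob", "b2", 30)
print(read_news_feed("alice"))
```

Collecting entries in a set avoids duplicates when an account crosses the star threshold after some of its tweets have already been pushed.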
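When to use Push?
  • limited resources
  • you want to keep things simple and write less code
  • low real-time requirements
  • users post infrequently
  • bidirectional friendships with no celebrity problem (e.g. WeChat Moments)

When to use Pull?
  • ample resources
  • high real-time requirements
  • users post frequently
  • one-way follow relationships with a celebrity problem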

Follow / Unfollow

  • After following a user, asynchronously merge their timeline into your news feed.

  • After unfollowing a user, asynchronously remove their tweets from your news feed.
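
The two feed updates above might look like the following. This is a sketch under assumptions: feed entries are `(created_at, author_id, tweet_id)` tuples kept newest first, both function names are hypothetical, and the asynchronous dispatch (for example a background job queue) is left out so that only the merge and removal logic is shown.

```python
import heapq
from itertools import islice


def merge_timeline_into_feed(feed, timeline, k=100):
    """On follow: merge the followed user's timeline into the feed, newest first."""
    merged = heapq.merge(feed, timeline, key=lambda e: e[0], reverse=True)
    return list(islice(merged, k))


def remove_author_from_feed(feed, author_id):
    """On unfollow: drop every entry written by the unfollowed user."""
    return [entry for entry in feed if entry[1] != author_id]


feed = [(30, "bob", "b2"), (10, "bob", "b1")]
carol_timeline = [(20, "carol", "c1")]

after_follow = merge_timeline_into_feed(feed, carol_timeline)
after_unfollow = remove_author_from_feed(after_follow, "carol")
print(after_follow)
print(after_unfollow)
```

Running these in the background keeps the follow and unfollow requests fast, at the cost of the feed being briefly out of date.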

Store Likes

  • Tweet table
    • id, userId, content, createdAt, likeNums, commentNums, retweetNums
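
Storing `likeNums` on the tweet row is a denormalization: the count is kept next to the tweet so it can be read without counting rows elsewhere. A minimal sketch, assuming a separate like table of `(user_id, tweet_id)` pairs, which the notes do not mention and is introduced here only for illustration:

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    id: int
    user_id: int
    content: str
    like_nums: int = 0  # denormalized counter stored on the tweet row


likes = set()  # hypothetical like table: (user_id, tweet_id) pairs


def like(tweet, user_id):
    """Record a like once per user and keep the counter in sync."""
    key = (user_id, tweet.id)
    if key in likes:
        return False  # already liked; leave the counter untouched
    likes.add(key)
    tweet.like_nums += 1
    return True


def unlike(tweet, user_id):
    key = (user_id, tweet.id)
    if key not in likes:
        return False
    likes.remove(key)
    tweet.like_nums -= 1
    return True


t = Tweet(id=1, user_id=42, content="hello")
like(t, 7)
like(t, 7)  # duplicate like is ignored
like(t, 8)
print(t.like_nums)
```

In a real database the insert into the like table and the counter update would need to happen atomically, or be reconciled later.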
