🍃 Mongo DB Sharding

You cannot run a database on a single instance forever. When the database comes under heavy load, you need another plan. There are two approaches: vertical scaling and horizontal scaling. Vertical scaling adds more RAM, more cores, and so on to a single machine. Horizontal scaling spreads the work across multiple machines.

A typical way to scale a database horizontally is to add replicas. With MySQL, for example, you create several read replicas, run writes on the master, and run reads on the replicated nodes to spread the load.

MongoDB supports horizontal scaling too, through a feature called sharding. The MongoDB documentation describes it as follows.

Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.

When a system can no longer withstand the load, sharding increases availability and also raises the throughput the system can sustain.

Sharded Cluster

A MongoDB sharded cluster consists of three components: shards, mongos, and config servers.

Shard

Each shard holds a subset of the sharded data in the cluster. Combining the data on all shards in the cluster gives the complete data set. So if you run a query directly against a single shard, you only get results for the data on that shard. To run a query at the cluster level, use mongos.
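To make the subset idea concrete, here is a toy in-memory model in plain JavaScript. It is not MongoDB code: the shard names, the documents, and the functions findOnShard and findOnCluster are all made up for this sketch, and the real fan-out and merge is done by mongos.

```javascript
// Toy model, not MongoDB code: each "shard" is just an array of documents.
// Shard names and documents are invented for illustration.
const shards = {
  shardA: [{ _id: 1, name: "kim" }, { _id: 2, name: "lee" }],
  shardB: [{ _id: 3, name: "park" }, { _id: 4, name: "choi" }],
};

// Querying one shard directly only sees the subset stored on that shard.
function findOnShard(shardName, predicate) {
  return shards[shardName].filter(predicate);
}

// A cluster-level query asks every shard and merges the results,
// which is roughly the job mongos does for the application.
function findOnCluster(predicate) {
  return Object.values(shards).flatMap((docs) => docs.filter(predicate));
}

const everything = () => true;
console.log(findOnShard("shardA", everything).length); // 2
console.log(findOnCluster(everything).length); // 4
```

The same query returns two documents from one shard and four from the whole cluster, which is why applications should go through mongos.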

For high availability, each shard must be deployed as a replica set.

Every database has a primary shard. The primary shard stores all of that database's collections that are not sharded. The name can be confusing, but the primary shard has nothing to do with the primary of a replica set.
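A small sketch of that rule, again as a toy lookup in plain JavaScript rather than anything MongoDB provides. The database name shop, the collection names, and the function shardsHolding are assumptions made for the example.

```javascript
// Toy lookup, not MongoDB code. Database, collection, and shard names
// are invented for illustration.
const primaryShardOf = { shop: "shardA" };
const shardedCollections = {
  "shop.orders": ["shardA", "shardB"], // sharded: spread over several shards
};

// Returns the shards that hold data for a collection in this toy model.
function shardsHolding(dbName, collName) {
  const namespace = `${dbName}.${collName}`;
  if (namespace in shardedCollections) {
    return shardedCollections[namespace];
  }
  // Unsharded collections live only on the database's primary shard.
  return [primaryShardOf[dbName]];
}

console.log(shardsHolding("shop", "orders")); // [ 'shardA', 'shardB' ]
console.log(shardsHolding("shop", "settings")); // [ 'shardA' ]
```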

mongos

MongoDB provides an instance called mongos to distribute queries across the shards. The MongoDB documentation describes the role of mongos as follows.

mongos provide the only interface to a sharded cluster from the perspective of applications. Applications never connect or communicate directly with the shards.

To route each query to the right shard, mongos caches metadata from the config servers. It has no persistent state of its own, however.

For details on how queries are routed, refer to the documentation.

config server

A config server stores the metadata for the sharded cluster. It knows which chunks each shard holds, and mongos uses that metadata to route queries.
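The chunk metadata can be pictured as a table of shard key ranges. The sketch below is a toy router in plain JavaScript, assuming a numeric shard key and three hand-written chunks; the chunk boundaries, shard names, and the functions routeByKey and routeByRange are invented, and the actual routing logic in mongos is more involved.

```javascript
// Toy routing table, not MongoDB code. Each chunk covers the shard key
// range [min, max): min is inclusive, max is exclusive.
// The ranges and shard names are invented for illustration.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardA" },
];

// An equality match on the shard key lands in exactly one chunk.
function routeByKey(key) {
  return chunks.find((c) => key >= c.min && key < c.max).shard;
}

// A range query lo <= key < hi goes to every shard owning an overlapping chunk.
function routeByRange(lo, hi) {
  const hit = chunks.filter((c) => c.min < hi && lo < c.max);
  return [...new Set(hit.map((c) => c.shard))];
}

console.log(routeByKey(150)); // shardB
console.log(routeByRange(50, 150)); // [ 'shardA', 'shardB' ]
console.log(routeByRange(120, 180)); // [ 'shardB' ]
```

A query that includes the shard key can be sent to only the shards that own matching chunks, while a query without it would have to go to all of them.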

MongoDB is also said to use the config servers to manage distributed locks, but I don't understand that part well yet.

The config servers presumably need to be set up as a replica set too; let's look into that later.

Security

A sharded cluster can use internal authentication for security. Don't forget that the security settings have to be applied to each mongod. For an actual setup, refer to Deploy Sharded Cluster with Keyfile Access Control.

What you gain from this

It goes without saying that reads and writes are distributed and that storage can be expanded. What I am most curious about is "is high availability guaranteed?" The documentation addresses this as follows.

A sharded cluster can continue to perform partial read / write operations even if one or more shards are unavailable. While the subset of data on the unavailable shards cannot be accessed during the downtime, reads or writes directed at the available shards can still succeed.

In other words, when one shard is unavailable, queries against the other shards can still run.
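Partial availability can be pictured with another toy model in plain JavaScript. This is not how MongoDB reports failures; the up flag, the shard contents, and the function readFromShard are assumptions for illustration only.

```javascript
// Toy model, not MongoDB code: shardB is marked as down.
const shards = {
  shardA: { up: true, docs: [{ _id: 1 }, { _id: 2 }] },
  shardB: { up: false, docs: [{ _id: 3 }] },
};

// A read targeted at one shard succeeds only if that shard is reachable.
function readFromShard(shardName) {
  const shard = shards[shardName];
  if (!shard.up) {
    throw new Error(`${shardName} is unavailable`);
  }
  return shard.docs;
}

console.log(readFromShard("shardA").length); // 2

try {
  readFromShard("shardB");
} catch (err) {
  console.log(err.message); // shardB is unavailable
}
```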

Setting it up in practice

To keep things clean, I'll set this up with Docker. First, pull the Docker image and create a network to connect the containers.

❯ docker pull mongo
Using default tag: latest
latest: Pulling from library/mongo
...

~
❯ docker network create mongo
0836403418d33db29b701e6911f641048d0a880720c88a6de4d3a9f3c4376bc5

~
❯ docker network ls
NETWORK ID          NAME                                DRIVER              SCOPE
...
...
0836403418d3        mongo                               bridge              local

Then start containers mongo1 through mongo7, changing --name for each one.

❯ docker run -it --rm --net=mongo --name=mongo1 mongo bash
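Only the command for mongo1 is shown. As a convenience, this small Node.js sketch prints the same command for all seven containers along with the role each one plays later in this post. It only prints text and does not run Docker; since each command uses -it, each one still needs its own terminal. The roles object and the function dockerRunCommand are helpers made up for this sketch.

```javascript
// Roles used in the rest of this post for each container.
const roles = {
  mongo1: "config server",
  mongo2: "config server",
  mongo3: "shard",
  mongo4: "shard",
  mongo5: "mongos",
  mongo6: "mongos",
  mongo7: "client shell",
};

// Builds the same docker run command as above for a given container name.
function dockerRunCommand(name) {
  return `docker run -it --rm --net=mongo --name=${name} mongo bash`;
}

for (const [name, role] of Object.entries(roles)) {
  console.log(`# ${role}\n${dockerRunCommand(name)}`);
}
```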

Setting up the config servers

First, start the config servers on mongo1 and mongo2. Since they will form a replica set, specify the --replSet option. Since other containers will connect to them, set --bind_ip 0.0.0.0.

root@bd14e1c615b0:/# mongod --configsvr --replSet config-replica-set --bind_ip 0.0.0.0

Then initiate the replica set using the replSet name given above.

root@8b69f35de3b5:/# mongo mongo1:27019
...
...
> rs.initiate({
... _id: "config-replica-set",
... configsvr: true,
... members: [
...   {_id: 0, host: "mongo1:27019"},
...   {_id: 1, host: "mongo2:27019"}
... ]
... })

You can check whether it was set up correctly with rs.status().
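The rs.status() output is long, so a small helper that keeps only the member names and states can make the check easier to read. The function summarizeReplicaSet is a hypothetical helper, and the sample object is hand-written in the shape of rs.status() output rather than copied from a real run; it only relies on the set, members[].name, and members[].stateStr fields.

```javascript
// Turns an rs.status() result into one line per member.
// Only the fields set, members[].name and members[].stateStr are read.
function summarizeReplicaSet(status) {
  return status.members.map((m) => `${status.set} ${m.name} ${m.stateStr}`);
}

// Hand-written sample shaped like rs.status() output (not real output).
const sampleStatus = {
  set: "config-replica-set",
  members: [
    { name: "mongo1:27019", stateStr: "PRIMARY" },
    { name: "mongo2:27019", stateStr: "SECONDARY" },
  ],
};

console.log(summarizeReplicaSet(sampleStatus).join("\n"));
```

Inside the mongo shell the same function should work as summarizeReplicaSet(rs.status()), assuming the shell version accepts arrow functions and template literals.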

Setting up the shard

Set up the shard servers on mongo3 and mongo4, using shard-replica-set as the replica set name.

root@8b69f35de3b5:/# mongod --shardsvr --replSet shard-replica-set --bind_ip 0.0.0.0

Initiate the replica set here as well.

root@7d536b10b886:/# mongo mongo3:27018
...
> rs.initiate({
... _id: "shard-replica-set",
... members: [
...   {_id: 0, host: "mongo3:27018"},
...   {_id: 1, host: "mongo4:27018"}
... ]
... })
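Both rs.initiate() calls take a document of the same shape, differing only in the name, the hosts, and the configsvr flag. A minimal sketch of a builder for that document, where replicaSetConfig is a hypothetical helper and not part of MongoDB:

```javascript
// Builds a document of the shape passed to rs.initiate() above.
function replicaSetConfig(name, hosts, isConfigServer = false) {
  const config = {
    _id: name,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
  if (isConfigServer) {
    config.configsvr = true; // only set for the config server replica set
  }
  return config;
}

const shardConfig = replicaSetConfig("shard-replica-set", [
  "mongo3:27018",
  "mongo4:27018",
]);
console.log(JSON.stringify(shardConfig, null, 2));
```

Passing true as the third argument with the mongo1 and mongo2 hosts gives the config server variant.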

Setting up mongos

mongos is pointed at the config servers right at startup. Start mongos on mongo5 and mongo6.

root@7d536b10b886:/# mongos --configdb config-replica-set/mongo1:27019,mongo2:27019 --bind_ip 0.0.0.0

Now that the config servers are connected, connect to mongo5 from mongo7 and run the following.

root@a5cadafbc76f:/# mongo mongo5:27017
mongos> sh.addShard("shard-replica-set/mongo3:27018,mongo4:27018")
{
  "shardAdded" : "shard-replica-set",
  "ok" : 1,
  "operationTime" : Timestamp(1555859895, 5),
  "$clusterTime" : {
    "clusterTime" : Timestamp(1555859895, 5),
    "signature" : {
      "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      "keyId" : NumberLong(0)
    }
  }
}
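Both the --configdb option and sh.addShard() take the same kind of string: the replica set name, a slash, then a comma-separated host list. A tiny sketch that builds it, with replicaSetSeed being a helper name invented here:

```javascript
// Builds "<replSetName>/<host1>,<host2>,..." as used by --configdb
// and sh.addShard() in the commands above.
function replicaSetSeed(name, hosts) {
  return `${name}/${hosts.join(",")}`;
}

const configSeed = replicaSetSeed("config-replica-set", ["mongo1:27019", "mongo2:27019"]);
const shardSeed = replicaSetSeed("shard-replica-set", ["mongo3:27018", "mongo4:27018"]);

console.log(`mongos --configdb ${configSeed} --bind_ip 0.0.0.0`);
console.log(`sh.addShard("${shardSeed}")`);
```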

Let's check that the shard was added properly.

mongos> db.stats()
{
  "raw" : {
    "shard-replica-set/mongo3:27018,mongo4:27018" : {
      "db" : "test",
      "collections" : 0,
      "views" : 0,
      "objects" : 0,
      "avgObjSize" : 0,
      "dataSize" : 0,
      "storageSize" : 0,
      "numExtents" : 0,
      "indexes" : 0,
      "indexSize" : 0,
      "fileSize" : 0,
      "fsUsedSize" : 0,
      "fsTotalSize" : 0,
      "ok" : 1
    }
  },
  "objects" : 0,
  ...

The shard shows up with its replica set!! Connecting through mongo6 works as well.
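When db.stats() is run through mongos, the keys of its raw field are the shards that answered, so reading them is a quick way to see which shards are registered. A minimal sketch, where shardsInStats is a made-up helper and sampleStats is a trimmed copy of the output above:

```javascript
// Lists the shards that reported statistics in a db.stats() result
// taken through mongos, by reading the keys of its "raw" field.
function shardsInStats(stats) {
  return Object.keys(stats.raw);
}

// Trimmed copy of the output above, keeping only a few fields.
const sampleStats = {
  raw: {
    "shard-replica-set/mongo3:27018,mongo4:27018": { db: "test", collections: 0, ok: 1 },
  },
  objects: 0,
};

console.log(shardsInStats(sampleStats));
```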

Wrapping up

This was only a very simple setup to see how the pieces fit together. For real use there is much more to configure. Security in particular would need stricter settings.

Written on April 22, 2019
Tags: db