title: How I run my Ceph cluster, or how you can build a killer storage solution for almost free
---
pub_date: 2020-01-08
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: yes
---
_discoverable: no
---
abstract:
I wanted to store some data, and this is what I came up with.
---
body:

Low cost, high tech data storage with Ceph

First of all, why would you run a Ceph cluster? It's complex, somewhat time consuming, and it's easier to lose data with it than with ZFS or ext4.

My reasons:

- it's very easy to expand and shrink
- you can manage all your data/disks from one host (which can also be a security risk)
- it's fun
- we have it in production and it scales well
- it unifies the physical storage

Step 1:

Find your local hardware dealer.

Second-hand sites can be a good source, but good deals are rare. My tactic on ricardo.ch is: Server & Zubehör, filters: used, auction, max 1.5k CHF, sorted by ending soonest. Link: https://www.ricardo.ch/de/c/server-und-zubehoer-39328/?range_filters.price.max=1527&item_condition=used&offer_type=auction&sort=close_to_end

Nearby hazardous-material (e-waste) handling companies can be a goldmine. Big companies cannot just throw used hardware out as regular waste, because electronics contain small amounts of lead (or other heavy metals). So big companies are sometimes happy to sell it cheap as used equipment, and e-waste companies are happy if they get more than the per-kilogram recycling price, which is very, very low.

Low quality (Core 2 Duo era) PCs also suffice, but you won't be able to run erasure-coded pools, as they use a ton of processing power and RAM. Be careful with the RAM: if you run out of swap/RAM, your OSD process will get killed; I learnt that the hard way. Recovery also sometimes uses more RAM than usual, so keep some free RAM for safety (see the sketch below).
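
As a rough sketch of that safety margin: on recent Ceph releases, BlueStore OSDs can be told how much memory to aim for via the osd_memory_target option. The 4 GiB value below is only an example, not a recommendation; tune it to what your nodes can actually spare.

```
# Cap what each OSD tries to use; applied cluster-wide at runtime
# through the monitors (value is in bytes, here 4 GiB):
ceph config set osd osd_memory_target 4294967296

# Equivalent static setting in /etc/ceph/ceph.conf:
# [osd]
# osd_memory_target = 4294967296
```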

Put 10G NICs on your shopping list; for performance they are absolutely crucial. I started without them, and while it's certainly doable, it won't perform well. A little hack is to pick up gigabit NICs (some people give them away for free) and put them in an LACP bond. Note: LACP doesn't make a single connection faster; the benefit only shows up with parallel connections, which Ceph produces plenty of. A rough setup sketch follows.
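
For illustration, such a bond can be brought up with iproute2 roughly as follows. The interface names eth0/eth1 and the address are assumptions, the switch ports must be configured for LACP too, and this is not persistent across reboots:

```
# Load the bonding driver and create an 802.3ad (LACP) bond:
modprobe bonding
ip link add bond0 type bond mode 802.3ad miimon 100 xmit_hash_policy layer3+4

# Slaves have to be down before they can be enslaved:
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0

# Bring it up and address it (10.0.0.11/24 is a made-up example):
ip link set bond0 up
ip addr add 10.0.0.11/24 dev bond0
```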

If you don't care, or your disks aren't equal in size or speed, no worries: Ceph will happily consume everything you feed it (except SMR disks*; use those strictly for frozen data, if at all). One hack is to snag some old, low-capacity disks for free. If you do everything right, you can surpass SSD speeds with crappy spinning rust. Worried about disks dying? Just run a higher redundancy level (keep two extra copies of your data).

My personal approach is to keep coldish data 2x, hot data like VMs 3x, and one extra copy of both on a non-Ceph filesystem; the sketch below shows how that maps to pools.
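
Replication is configured per pool, so that mix is easy to express. A minimal sketch with made-up pool names and placeholder PG counts:

```
# Two hypothetical pools; pick PG counts to suit your cluster.
ceph osd pool create cold 128
ceph osd pool create vms 128

# "size" is the total number of copies Ceph keeps;
# "min_size" is how many must be up for I/O to continue.
ceph osd pool set cold size 2
ceph osd pool set vms size 3
ceph osd pool set vms min_size 2
```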

You can also group disks by performance/size. Ideally the disks should be uniform within a Ceph device class, and equally distributed between hosts; see the sketch below.
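
Ceph usually detects hdd/ssd device classes on its own; here is a sketch of pinning a pool to one class. The names osd.7, replicated-ssd and fastpool are made up:

```
# See which class each OSD was assigned:
ceph osd tree

# Fix a wrongly detected class by hand (osd.7 is hypothetical):
ceph osd crush rm-device-class osd.7
ceph osd crush set-device-class ssd osd.7

# A CRUSH rule that only places data on ssd-class OSDs,
# with host as the failure domain:
ceph osd crush rule create-replicated replicated-ssd default host ssd

# Attach an existing pool to that rule:
ceph osd pool set fastpool crush_rule replicated-ssd
```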

Avoid hardware RAID; use cards that give the OS full control over the disks. If you must use HW RAID, single-disk RAID0 is the way.

Install:

You can check out my ugly install script, which is meant to bootstrap a cluster on a VM.

Tested on an Alpine VM with an attached /dev/sdb data block (don't use legacy IP (IPv4)):

apk add bash
wget http://llnu.ml/data/ceph-setup
bash ./ceph-setup $ip_address_of_the_machine $subnet_that_you_will_plan_to_use

Operation:

I've never prepared a disk manually yet (which I should definitely review), because Nico wrote amazing helper scripts, which can be found in our repo: https://code.ungleich.ch/ungleich-public/ungleich-tools.git
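
For reference, the manual route such scripts automate is, to my understanding, roughly a single ceph-volume call; /dev/sdc below is a made-up device:

```
# Prepare and activate a BlueStore OSD on a raw disk in one step;
# this creates the LVM volumes, a new OSD id and its auth key:
ceph-volume lvm create --data /dev/sdc
```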

Some scripts still need minor modifications because Alpine doesn't ship Ceph init scripts yet. For the time being I manage the processes by hand, roughly as sketched below.
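
Managing them by hand mostly means starting the daemons yourself. A rough sketch; the mon/mgr/OSD IDs are hypothetical and should match whatever your bootstrap created:

```
# Monitor and manager (IDs are whatever the cluster was created with):
ceph-mon -i mon0
ceph-mgr -i mgr0

# One process per OSD, by numeric id:
ceph-osd -i 0
ceph-osd -i 1

# Verify everything registered:
ceph -s
```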

Alpine's vanilla kernel doesn't have RBD support compiled in at the moment, but on any kernel that lacks the module you can just use rbd-nbd to map block devices from your cluster, along these lines.
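
A quick sketch of that workflow; the pool/image names and the size are made up:

```
# Create a 10 GiB image in a hypothetical "vms" pool (size is in MB):
rbd create vms/disk0 --size 10240

# Map it via the NBD driver instead of the missing in-kernel rbd module;
# prints the device node it attached, e.g. /dev/nbd0:
rbd-nbd map vms/disk0

# Use /dev/nbd0 like any other block device, then detach:
rbd-nbd unmap /dev/nbd0
```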
|
* https://blog.widodh.nl/2017/02/do-not-use-smr-disks-with-ceph/

Some useful commands:
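
For day-to-day poking around, the standard status commands cover most needs:

```
ceph -s                   # cluster health, capacity and I/O at a glance
ceph health detail        # expands exactly which checks are failing
ceph osd tree             # OSDs by host, with up/down state and weights
ceph df                   # raw and per-pool space usage
ceph osd pool ls detail   # per-pool size, min_size, pg_num and crush rule
ceph -w                   # follow the cluster log live
```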