Monday, November 23, 2015

(Rolling) restart of elasticsearch datanodes

elasticsearch 1.7.3





A planned restart of a data node should include disabling shard allocation (routing) first, to avoid unnecessary rebalancing and a prolonged recovery period when the node rejoins the cluster.

Example:
Stop the routing:
PUT /_cluster/settings
{
    "transient" : {
        "cluster.routing.allocation.enable" : "none"
    }
}

Should reply with:
{
  "persistent": {
  },
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "enable": "none"
        }
      }
    }
  },
  "acknowledged": true
}


Stop and do whatever you need to do, then start the node and wait for the cluster reporting the node rejoin in the logs:
[2015-11-23 01:18:32,623][INFO ][cluster.service          ] [servername] added {[servername2][2DwlAl3SAe-aijdas1336Ew][servername2][inet[/1.1.1.2:9300]],}, reason: zen-disco-receive(join from node[[servername2][2DwlAl3SAe-aijdas1336Ew][servername2][inet[/1.1.1.2:9300]]])

When to re-enable routing depends on how busy the cluster is. Re-enabling will cause additional load and stress, and I have myself been delaying it for some hours until a suitable occasion appears. I must stress that the documents held by the rejoined node will not be visible to the cluster, nor will the rejoined node in any way offload the rest, since the data it holds is not allocated (of course).

Depending on the shard and replication settings, the cluster may also lack redundancy while routing allocation is off; a yellow overall status is normal during the transition.

To re-enable the routing:
PUT /_cluster/settings
{
    "transient" : {
        "cluster.routing.allocation.enable" : "all"
    }
}

Watch the status until it becomes green and continue to the next node if needed. The time for the cluster to become green, that is, in this context, for all shard and replication criteria to be met, varies greatly with the capacity of each node, the overall load and, not least, the number of documents. E.g. a 3-node cluster with 200M documents should spend somewhere around 10 minutes on recovery, not more.

GET _cluster/health
{
  "cluster_name": "clustername",
  "active_primary_shards": 56,
  "active_shards": 112,
  "number_of_data_nodes": 3,
  "number_of_in_flight_fetch": 0,
  "number_of_nodes": 5,
  "unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "timed_out": false,
  "delayed_unassigned_shards": 0,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "status": "green"

}
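The per-node cycle above (disable allocation, restart, re-enable, wait for green) can be scripted. A minimal Python sketch using only the stdlib, assuming a node reachable on http://localhost:9200; the helper names are my own, not from the Elasticsearch API:

```python
import json
import urllib.request

ES = "http://localhost:9200"  # assumption: a locally reachable node

def allocation_body(mode):
    """Build the transient settings body; mode is 'none' or 'all'."""
    return json.dumps({"transient": {"cluster.routing.allocation.enable": mode}})

def set_allocation(mode):
    """PUT /_cluster/settings with the given allocation mode."""
    req = urllib.request.Request(
        ES + "/_cluster/settings",
        data=allocation_body(mode).encode(),
        method="PUT",
    )
    return json.load(urllib.request.urlopen(req))

def is_green(health):
    """health is the parsed /_cluster/health document shown above."""
    return health.get("status") == "green" and health.get("unassigned_shards") == 0
```

Usage would be: set_allocation("none"), restart the node out of band, set_allocation("all"), then poll GET /_cluster/health until is_green(...) returns true.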


Thursday, November 5, 2015

How to replace a headset

Replacing the headset is usually very straightforward and simple, given you have access to the right tools, which you will see in the pictures below. The two main tools are the extractor and the press-fit compressor. Theoretically one can do without these tools, but at the risk of damaging the frame. The frame below is a Scott CR1 carbon, and I would say that doing this job without the tools would probably result in a damaged frame. The press-fit cups tend to sit very tight since, unlike the bottom bracket, they are made of aluminium.

The reason for replacing the headset was a rusty lower bearing; the headset started to bind shortly after use, and that led me to doubt its quality. I continued to use the bike for the rest of last season and this one, but it got worse and worse, affecting safety, especially noticeable during fast descents.


Dismantle the stem in the usual way, first by removing the rings and the top cap.


Be sure to hold the fork while loosening the stem to avoid it falling to the floor.


The fork should come loose; some of the further service might require dismantling the front brake.


The top bearing is looking good, almost no rust, and does not really need replacement.


Inspect the top and bottom to see if there is any wear, cracks etc. in the carbon.


I purchased a new complete high-quality sealed BBB headset; the original Ritchey was not sealed, and I believe that could have accelerated the wear and tear, since a headset should only need replacement every 3-5 years.


This is the press-fit extractor; it enables you to apply force on the press-fit cup itself, and not the frame, when hammering it out.


Insert the extractor backwards


Be sure that the blades on the extractor engage with the inside edges of the press-fit cup. If a rubber hammer doesn't do it, use a metal hammer, but be careful and take your time to avoid damaging the frame.


The bearings can usually be removed without force since they are only held in place by the fork, which is now removed. As you can see, the bearing was very rusty, and no wonder it caused binding.


Same process for removing the top press-fit cup; notice the metal hammer :)


Inspect the inside for the reasons mentioned earlier. The white stuff you see inside the carbon frame is normal and is residue from the molding.


Wipe and clean


Notice the remains of the old bearing; this is a seal and must be removed since it does not fit the new bearing.


Removal was hard in this case, since the rust had glued it to the carbon.


Careful prying got it moving after a while



There are different opinions on what to do with the surface between the press-fit cup and the carbon. Some say to keep it dry, but I prefer to lube it up, in this case using lithium grease for longevity and resistance to moisture.


New press-fit cups lined up with the compressor; be careful when tightening to keep everything aligned.


New lower bearing inserted onto the fork


New top bearing inserted


Insert the spacers and tighten up by hand; be careful not to tighten too hard. It should not bind when turning, and there should be no slack.


There, done!


Go biking!

Wednesday, November 4, 2015

Elasticsearch and stemming

The main use of Elasticsearch is storing logs, and lots of them. Searching through the data using the Kibana frontend is awesome, and one usually finds what one is looking for.

But let's have a bit of fun with the stemming support in Elasticsearch.

Elasticsearch provides good support for stemming via numerous token filters, but let's focus on the Hunspell stemmer this time.

Elasticsearch has the stemmer already (v1.7.3), but does not ship the dictionaries. So the first step is to get the dictionaries and install them; I am not going into details here. Bounce the cluster (yep, all nodes) and be sure that the dictionaries load in nicely.

By default, newly created indices do not use stemming, so one has to set this up when creating the index.

PUT /skjetlein

{
  "settings": {
    "analysis": {
      "filter": {
        "en_US": {
          "type":     "hunspell",
          "language": "en_US" 
        }
      },
      "analyzer": {
        "en_US": {
          "tokenizer":  "standard",
          "filter":   [ "lowercase", "en_US" ]
        }
      }
    }

If the dictionaries are missing from one or several nodes, you will receive a failure notice.

Otherwise:
{
  "acknowledged": true
}

Verify the settings

GET /skjetlein/_settings

{
  "skjetlein": {
    "settings": {
      "index": {
        "uuid": "ny3n0uJMRKywpvy6OCRmLw",
        "number_of_replicas": "1",
        "analysis": {
          "filter": {
            "en_US": {
              "type": "hunspell",
              "language": "en_US"
            }
          },
          "analyzer": {
            "en_US": {
              "filter": [
                "lowercase",
                "en_US"
              ],
              "tokenizer": "standard"
...

Let's test the stemming:

GET skjetlein/_analyze?analyzer=en_US -d "driving painting traveling"

...output should be something like this:
{
  "tokens": [
    {
      "token": "drive",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "painting",
      "start_offset": 8,
      "end_offset": 16,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "paint",
      "start_offset": 8,
      "end_offset": 16,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "traveling",
      "start_offset": 17,
      "end_offset": 26,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "travel",
      "start_offset": 17,
      "end_offset": 26,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}

I.e. the results include drive, paint and travel. Looks good.

So what could the use case be in the context of Elasticsearch, which usually stores vast amounts of logs (events)? Well, let's say I search through the logs for problems with filesystems. Elasticsearch as-is would require search strings that include every possible word related to filesystems, since logs in this context, such as syslog, do not provide the information in a form that lets the search be expressed consistently.

E.g. filesystem names can be stemmed to 'filesystem':
"custom_stem": {
          "type": "stemmer_override",
          "rules": [ 
            "ext2fs=>filesystem",
            "nfs=>filesystem",
            "btrfs=>filesystem"
...

or
           "postfix=>mail",
            "smtp=>mail",
            "qmail=>mail"
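What the stemmer_override rules above accomplish can be illustrated in plain Python: each listed token is rewritten to a canonical form before indexing, so a single search term matches all of them. The rule set below combines the two snippets above; everything else is my own illustration, not Elasticsearch code:

```python
# Rules taken from the stemmer_override snippets above.
RULES = {
    "ext2fs": "filesystem",
    "nfs": "filesystem",
    "btrfs": "filesystem",
    "postfix": "mail",
    "smtp": "mail",
    "qmail": "mail",
}

def normalize(tokens):
    """Rewrite tokens the way stemmer_override would, leaving others alone."""
    return [RULES.get(t.lower(), t) for t in tokens]

print(normalize(["nfs", "timeout", "qmail"]))  # → ['filesystem', 'timeout', 'mail']
```

A search for "filesystem" would then match events that originally mentioned only nfs or btrfs.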



Friday, October 30, 2015

Logstash filters

It is not always a good thing to have options, especially many of them. I tend to start thinking about all the combinations, the pluses and minuses of every aspect. And what about the future: what possible negative consequences could it have if I choose a instead of b now, and how hard would it be to change back?

Diving into Logstash, not to mention logstash-forwarder (lumberjack), is a daunting task. It's not that it is difficult to understand, but dealing with all the choices is.

Recently I had a dilemma, no big thing, but anyway: when to set "type"?

But the real question is why set type at all? Well, the typical use is to tell Logstash how to process the data. Do we really need the type setting? Not really, but it simplifies the configuration and makes it more readable too.

I hastily set up logstash-forwarder on a webserver with a large amount of traffic and, without really thinking about any technical/architectural decisions, set the type on the client. When working through the pipeline and finally configuring Logstash, I noticed that the type was already set, but not to exactly what fitted my needs.

The type was set on the client to apache-access; the access log needs its own type declaration since its format differs from e.g. the error log. But on the Logstash side I had used the more general type 'apache'. I could not just change this, since Logstash was already receiving data from other servers in production.

So back to options. A neat thing with logstash-forwarder is the annotation of the object sent: if the data comes from a log file, the object is annotated with where it came from. Then, with some grok'ing, it is easy to filter objects based not only on the set type, but also on the source file name.

E.g.:
filter {
  if [type] == "apache" {
    grok { 
      match => { "message" => "%{COMBINEDAPACHELOG}" }
      match => { "file" => "%{GREEDYDATA}.access.log" }
    }
  }
}
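The idea behind the file-name match can be illustrated outside Logstash as well. A Python sketch of the same decision, with a regex mirroring the %{GREEDYDATA}.access.log pattern above; the function name is mine:

```python
import re

# Mirrors the grok pattern: anything ending in ".access.log".
ACCESS_LOG = re.compile(r"^.*\.access\.log$")

def needs_access_parsing(event):
    """Decide, from type plus source file, whether the access-log grok applies."""
    return event.get("type") == "apache" and bool(ACCESS_LOG.match(event.get("file", "")))

print(needs_access_parsing({"type": "apache", "file": "/var/log/httpd/site.access.log"}))  # → True
print(needs_access_parsing({"type": "apache", "file": "/var/log/httpd/site.error.log"}))   # → False
```

This is why the general 'apache' type on the Logstash side was workable: the file annotation carries enough information to recover the more specific classification.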

StrongSWAN for IPSec IKEv2 remote access server


Finding the configuration sweet spot that allows any client to connect is a lengthy and tiresome process, mostly due to the lack of client documentation and also due to the vast number of bugs that cause all kinds of weirdness and the need for workarounds.

Strongswan is an awesome IPsec suite that has, as far as I know, the best open-source support for IKEv2, which is becoming more and more common now that Apple supports it on both mobile and desktop OSes.

Just a few notes below on my findings that might help others along the way. My background is setting up an IPsec system based on Strongswan supporting a very large userbase, with a lot of automation; I even wrote my own two-factor system integrated on top of everything (server-side auth, client challenge).


OS X 10.11, iOS 8 and newer

Certificates
The client never asks for the server certificate if it does not know what to ask for. That means a configuration profile with the proper CN set is needed, or a public certificate. I would recommend the latter for ease of deployment.

SplitDNS
Having problems getting the DNS pushed from the server to work? The DNS payload is actually pushed from the server and installed/accepted by the client; check for yourself by running
scutil --dns
...but the servers are never used. The workaround is to use a configuration profile; with this you even get split DNS, so it is absolutely worth doing.

Config snippets from a profile:
            <key>DNS</key>
            <dict>
                <key>ServerAddresses</key>
                <array>
                    <string>110.10.11.4</string>
                    <string>110.10.11.5</string>
                </array>
                <key>SearchDomains</key>
                <array>
                    <string>roger.se</string>
                    <string>skjetlein.no</string>
                </array>
                <key>SupplementalMatchDomains</key>
                <array>
                    <string>roger.se</string>
                    <string>skjetlein.no</string>
                </array>
            </dict>


Default encryption proposals

OS X 10.11
  • IKE:3DES_CBC/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024

iOS 9
  • IKE:AES_CBC_128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024
  • IKE:AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_1536
  • IKE:3DES_CBC/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024

    Microsoft Windows 7 & 8

    Windows is a sad story, with a dash of the typical Microsoft screwing up of standards and insane technical implementations.

    The out-of-the-box IKEv2 client, albeit one of the first movers, has some strange behaviours that are worth mentioning.

    NAT and DH2
    This works great. But if you change the Diffie-Hellman group to something else, the client will disconnect after approx. 50 minutes. The reason is that Windows wants to rekey, and when using NAT, the rekey fails and the client disconnects.

    Routing, TS and Child SA
    Forget about TS and Child SA; Windows does not use the TS, and you need to use the capability accessible via the GUI to set one of the following options:
    • All traffic routed via VPN
    • Classful routing
    • No traffic via VPN
    Classful routing is an oddity where the client sets up a route based on the address class of the assigned virtual IP. E.g. given an address on the 34.2.3.0 network, a 34.0.0.0/8 route will be installed. Why? I don't know, but my impression after digging through the innards of Windows is that this is not only a remnant from the modem/PPP era, but still the main VPN framework.

    'No traffic via VPN' forces you to set the routes manually after connecting, either by running route add commands in a shell or by using the CMAK package from Microsoft, which will add the routes for you.

    Default encryption proposals

    • IKE:3DES_CBC/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024, 
    • IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024, 
    • IKE:3DES_CBC/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_1024, 
    • IKE:AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_1024, 
    • IKE:3DES_CBC/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_1024, 
    • IKE:AES_CBC_256/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_1024
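    On the server side, restricting Strongswan to a proposal from the Windows list above keeps the out-of-the-box client connecting. A minimal ipsec.conf sketch matching the AES-256/SHA1/MODP-1024 entry; the connection name and the esp line are my own assumptions, not from the original notes:

```
conn win-ikev2
    keyexchange=ikev2
    # AES_CBC_256 / HMAC_SHA1_96 / MODP_1024, from the Windows default list above
    ike=aes256-sha1-modp1024!
    esp=aes256-sha1!
```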

    Microsoft Windows 8.1 & 10

    Getting things running here is much easier, and my preferred method is deploying a PowerShell script that creates the VPN profile and sets up the correct routing and auth methods.

    Example script
    Add-VpnConnection -Name "Workplace" -SplitTunneling -ServerAddress vpn.workplace.ne -AuthenticationMethod Eap -EncryptionLevel Required -TunnelType Ikev2
    Add-VpnConnectionRoute -ConnectionName "Workplace" -DestinationPrefix 1.2.3.0/24
    Add-VpnConnectionRoute -ConnectionName "Workplace" -DestinationPrefix 2.3.4.0/24

    Default encryption proposals

    • IKE:3DES_CBC/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024, 
    • IKE:3DES_CBC/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_1024, 
    • IKE:3DES_CBC/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_1024, 
    • IKE:AES_CBC_128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024, 
    • IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_1024, 
    • IKE:AES_CBC_128/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_1024, 
    • IKE:AES_CBC_192/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024, 
    • IKE:AES_CBC_192/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_1024, 
    • IKE:AES_CBC_192/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_1024, 
    • IKE:AES_CBC_256/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_1024, 
    • IKE:AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_1024, 
    • IKE:AES_CBC_256/HMAC_SHA2_384_192/PRF_HMAC_SHA2_384/MODP_1024


    VoWifi leaking IMSI

    This is mostly a copy of the Working Group Two blog post I worked on when the research was done into IMSI leakage when using voice...