By default, Elasticsearch assumes a one-machine, one-node setup (you specify node settings in elasticsearch/config/elasticsearch.yml). So what happens if you want to run multiple nodes on one box, say, to play with a multi-node cluster on your dev machine?
The easy answer is that you can create multiple elasticsearch.yml files (elasticsearch.0.yml, elasticsearch.1.yml, and so on) and then start each instance from the command line, pointing it at its own config file.
/usr/local/bin/elasticsearch -fD es.config=/usr/local/Cellar/elasticsearch/0.xx.x/config/elasticsearch.0.yml
/usr/local/bin/elasticsearch -fD es.config=/usr/local/Cellar/elasticsearch/0.xx.x/config/elasticsearch.1.yml
That should get you most of the way there (the new node comes up on port 9201). If you run into problems and need an alternative, read this detailed response on Stack Overflow.
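If hand-copying config files gets tedious, a few lines of Ruby can generate them. This is only a sketch under some assumptions: the cluster name, the node names, and the explicit http.port values are illustrative choices, not something the setup above requires (left alone, the second node picks the next free port by itself). The helper names node_settings and write_node_configs are hypothetical too.

```ruby
require "yaml"

# Settings for node number `index`.
# The cluster name, node names, and port scheme are illustrative assumptions.
def node_settings(index, cluster_name: "dev-cluster")
  {
    "cluster.name" => cluster_name,
    "node.name"    => "dev-node-#{index}",
    "http.port"    => 9200 + index # 9200 for node 0, 9201 for node 1, ...
  }
end

# Write elasticsearch.<index>.yml files into an existing directory
# and return the paths that were written.
def write_node_configs(config_dir, count)
  (0...count).map do |index|
    path = File.join(config_dir, "elasticsearch.#{index}.yml")
    File.write(path, node_settings(index).to_yaml)
    path
  end
end
```

Calling write_node_configs with an existing directory and a count of 2 leaves elasticsearch.0.yml and elasticsearch.1.yml in that directory, ready to be passed to es.config as in the commands above.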
I’ve been doing a lot of Elasticsearch work at my full-time job and I’m liking it very much (I’m actually in San Francisco for an Elasticsearch conference right now). That being said, I started reading this great article by Jon Tai about how to use Elasticsearch as a supplement to your database to get quicker results for unstructured/complex queries. Then I looked at the rest of his blog posts about Elasticsearch and quickly realized that if you’re trying to get up to speed, there isn’t clearer, more easily digestible writing on the web about the basics of Lucene and Elasticsearch.
Trust me, I know. I’ve been screwing with ES for the last six months or so, and the knowledge I have is pieced together from numerous Google searches, Stack Overflow questions, random one-off blog posts about Elasticsearch or Tire, and videos from the Elasticsearch site.
So once you actually get ES set up on your dev machine, get yourself a good cup of whatever and snuggle up with the following (in this order).
- Testing Lucene Analyzers with elasticsearch
- Lucene Scoring and elasticsearch’s _all Field
Then watch this 40-minute video by Elasticsearch creator Shay Banon, which explains how Elasticsearch is designed and how to use that design to your advantage:
- Big Data, Search and Analytics (I’ve watched this 3 times since last August and I pick up something new each time)
Ever been scrolling through your tumblr, for what seemed like hours, but didn’t want to stop because you’d lose your place without getting to where you stopped the last time?
I love tumblr, but this bugged me so much that I hacked together a Tumblr Timestamps Chrome extension that tells you exactly when a tumblr post was published (it slots the timestamp into the lower left-hand corner of every post). This way you can keep track of where you start or leave off, and (hopefully) better manage how much time you spend on tumblr.
A lot of times, I’ll get on the phone to check my balance or do something routine while I’m in my office (which I share with 2 other people), but the customer service menu navigation is ONLY voice activated. Since I don’t want to disturb my co-workers, I either have to stop what I’m doing and leave the room to find a quiet place to yell instructions at my phone or just remember to do it later (which I never do). Apart from the potentially poor user experience (slower/inconvenient way to get through a menu you’re already familiar with), it simply is a massive pain in the ass sometimes.
What’s so frustrating is that this can easily be fixed by giving the user the option of hitting a button to revert to the number pad for navigation. But then again, if a company has a voice navigated customer service menu, they probably don’t really give two ____s about what’s convenient for you.
Interview after interview with some of the world’s most successful people—actress Laura Linney, Zappos CEO Tony Hsieh, crossword mastermind Will Shortz—they began seeing patterns emerge. No matter how diverse their goals or crafts, these super-achievers shared many of the same habits. How can you follow in their footsteps?
Read the article here or watch this short and sweet summary (recommended) …
This one is really nerdy, but it would be cool if the “About Google Chrome” page of the browser showed a list of what’s new or what’s changed right after the version number.
This could be restricted to users on the dev/beta channel. I figure if you’re on dev/beta, you care about that kind of information.
Even cooler would be a list of all the versions you previously used, folding out on click, so you could see your own specific upgrade path (along with the changelogs).
Watching this 60 Minutes episode (scroll to end of post) got my mind running in a hundred different directions. It’s clear to me now that the future of software is Big Data Analytics and Machine Learning. In the future, people aren’t going to just want the software we’re churning out right now. They’re going to want software that learns their preferences and adjusts to their needs.
Manually set my alarm before I go to bed? Pffft. The iOS alarm app in 2020 will monitor the time you went to bed and use your past behavior to know that you need to be up in 4 hours.
The other thing I realized is that the American economy is slowly moving to a phase where high-skilled jobs will make up most of the job market. These jobs will be highly paid, but will also have to be highly taxed to support the rest of the country that simply will not have work.
We live in interesting times.
I restarted Elasticsearch and started getting a nasty stack trace in my Elasticsearch logs, the key line being:
failed to connect to master [[Buzzard][bC1NWlbVT8Wnq7adl3VetA][inet[/192.168.1.2:9300]]]
There was no IP address like that on my network. It was maddening, because no matter what I tried, it kept trying to find that non-existent master node.
Turns out that older versions of Elasticsearch (I’m running 0.19.2; the current version is 0.20.x) have a problem where stale master ID information can be broadcast over the network by a client node. This probably happened because I took my laptop home from work and restarted Elasticsearch at home (different network, different IP address, etc.).
Eventually I found the solution here.
If you’re getting this error when you go to start up Elasticsearch, multicast is probably not working properly. I’m running Elasticsearch on a single server (dev environment) and didn’t need all the ceremony.
So I just went into elasticsearch.yml (mine was in /usr/local/Cellar/elasticsearch/0.19.2/config/elasticsearch.yml) and set
That was it. Elasticsearch came right back up!
In very unscientific tests Ruby 2.0 is 60-70% faster than 1.9.3.
PS: I couldn’t get my test suite to run in Ruby 2.0.0, but I managed to run the very simple “time bundle exec rake environment” test.
The averages were:
- 7.74s for ruby 2.0.0
- 11.8s for ruby 1.9.3-p368, so 2.0.0 finished in about 66% of the time (roughly 1.5x as fast), broadly in line with the results from the gist
- 17.07s for rubinius 2.0.0 (1.9 variant) … yeah, that’s slow
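A quick sanity check on those averages, since "percent faster" means different things depending on the baseline. The sketch below uses only the numbers listed above; the helper names are made up for illustration. Note that 7.74 / 11.8 is about 0.66, which is where a figure in the mid-60s comes from: 2.0.0 needs roughly two thirds of the 1.9.3 time.

```ruby
# Average wall-clock times in seconds, copied from the runs above.
average_seconds = {
  "ruby 2.0.0"     => 7.74,
  "ruby 1.9.3"     => 11.8,
  "rubinius 2.0.0" => 17.07
}

# Share of the baseline's time that the candidate needs, as a percentage.
def percent_of_time(baseline, candidate)
  (candidate / baseline * 100).round(1)
end

# Percentage of wall-clock time saved relative to the baseline.
def percent_time_saved(baseline, candidate)
  ((baseline - candidate) / baseline * 100).round(1)
end

# How many times as fast the candidate is compared to the baseline.
def speedup(baseline, candidate)
  (baseline / candidate).round(2)
end

old_time = average_seconds["ruby 1.9.3"]
new_time = average_seconds["ruby 2.0.0"]

puts "2.0.0 needs #{percent_of_time(old_time, new_time)}% of the 1.9.3 time"
puts "time saved: #{percent_time_saved(old_time, new_time)}%"
puts "speedup:    #{speedup(old_time, new_time)}x"
```

With these inputs the three figures come out to 65.6%, 34.4%, and 1.52x, so it is worth saying which one you mean when quoting a single percentage.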
We just moved to the latest Test::Unit, which uses MiniTest under the hood. Because of that, I wanted to take advantage of the new MiniTest::Spec DSL to get RSpec-like syntax in Test::Unit.
The problem is, when I went to try using it, I started getting all sorts of strange duplicate key/column errors.
At first I thought it was because of a recent upgrade I had done from Postgres 9.1 to 9.2, but as I dug in deeper, I realized it was my new spec test. We run our tests in transactions that we roll back on completion of each test. MiniTest::Spec wasn’t doing this, so rows from one test leaked into the next and caused all the test errors.
After hours of digging I finally found the solution to the minitest duplicate key issue here. I hope it saves you some time. Definitely did for me!
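To make the failure mode concrete, here is a minimal sketch of the general idea: open a transaction before each spec and roll it back afterwards. It is not necessarily the fix from the linked answer, and it does not touch Rails or Postgres at all. FakeStore is a hypothetical in-memory stand-in for a database connection with a unique index, and the block assumes a Minitest 5-style install.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for a database connection: it snapshots its rows
# on begin_transaction and restores them on rollback.
class FakeStore
  attr_reader :rows

  def initialize
    @rows = []
    @snapshot = nil
  end

  def begin_transaction
    @snapshot = @rows.dup
  end

  def rollback
    @rows = @snapshot
    @snapshot = nil
  end

  # Mimics a unique index: inserting the same key twice raises.
  def insert(key)
    raise ArgumentError, "duplicate key: #{key}" if @rows.include?(key)
    @rows << key
  end
end

STORE = FakeStore.new

describe "users" do
  # Wrap every spec in a transaction so inserted rows never leak
  # from one spec into the next.
  before { STORE.begin_transaction }
  after  { STORE.rollback }

  it "inserts alice" do
    STORE.insert("alice")
    assert_equal ["alice"], STORE.rows
  end

  it "inserts alice again without a duplicate key error" do
    STORE.insert("alice")
    assert_equal 1, STORE.rows.size
  end
end
```

Remove the before/after pair and whichever spec runs second fails with a duplicate key error, which is the same shape as the errors described above.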
PS: The other alternative that was pointed out to me was to simply use the minitest-rails gem. I tried it, but it didn’t work cleanly out of the box for our setup (errors in tests, etc.). I was too tired to fix it up, so I’ll kick the can down the road until I need to revisit this issue, probably when upgrading to Rails 4.