I follow several Twitter feeds that tweet verses from the Bible. I whipped up the KJVTweeter Twitter account when I realized that the accounts I followed had some common afflictions:
- tweeting popular verses only
- selecting verses over 140 characters
- tweeting one verse at a time from the beginning
The popular verses are nice, but they don’t help me learn anything new. Longer verses require a click-through, which isn’t always desirable. They might be lesser-known verses, but for a medium like Twitter, you really want to focus on the tweet-friendly verses. Another account is beginning in Genesis and is estimated to be done in 83 years. I just wanted a simple account that tweets short verses in random order. KJVTweeter (github) does exactly that.
Behind the Scenes:
I found a copy of the King James Version in text format. I wanted a version with shortened book names (tweet-friendly). The copy available at av1611.com had the book, chapter, and verse number, a newline, and then the verse. Some quick Perl transforms the file so that each verse sits on a single line, and prints a verse only if it is under 141 characters.
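Here is a rough sketch of that kind of filter, written with awk rather than the Perl in parser.pl (which isn’t reproduced here). It assumes the input strictly alternates between a reference line and a verse line, that line endings are already Unix-style, and that the 140-character limit applies to the joined line. The function name join_verses and the sample reference format are made up for illustration.

```shell
# Hypothetical stand-in for parser.pl: pair each reference line with the
# verse line after it and keep only results of 140 characters or fewer.
join_verses() {
  awk '
    NR % 2 == 1 { ref = $0; next }   # odd lines hold book, chapter, and verse
    {
      line = ref " " $0              # even lines hold the verse text
      if (length(line) <= 140) print line
    }
  '
}

# Made-up sample input in the assumed two-line layout
printf 'Joh 11:35\nJesus wept.\n' | join_verses
```

Counting the reference toward the limit is a guess on my part; if only the verse text should count, test length($0) instead.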
The transformation process after obtaining the file is short and sweet:
unzip -p KJV.zip | ./parser.pl | sort -R -o random_bible.txt
One interesting point is that the unzip program can’t read an archive from STDIN, so piping the output of wget or curl straight into this pipeline won’t work. The archive has to be saved to disk first.
It’s also been a while since I’ve come across a file with carriage returns. I was having a hard time figuring out why I couldn’t do something as simple as joining two strings. In the original version, I just used dos2unix, but it was just as easy to strip the carriage returns with a substitution.
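A small illustration of the carriage-return problem, using tr instead of dos2unix or a Perl substitution. The function name strip_cr and the sample lines are hypothetical. When the carriage return is left in, a terminal typically jumps back to the start of the line, so the second string appears to overwrite the first, which is one likely reason a simple join looks broken.

```shell
# Remove every carriage return from the stream (DOS to Unix line endings).
strip_cr() {
  tr -d '\r'
}

# Made-up sample with DOS line endings; paste joins each pair of lines
# with a single space once the carriage returns are gone.
printf 'Joh 11:35\r\nJesus wept.\r\n' | strip_cr | paste -d ' ' - -
```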
The tweeting shell script is run every hour. It takes the very first line from the verse file, tweets it, then removes that line from the file. I was having difficulty figuring out how I’d select a random line from the file (shuf -n1) and later remove it. I originally pulled a random line, then used grep to get the line number, then used sed to remove that line number. It is much more efficient to shuffle the file once up front, then pull from the top. The Perl to tweet the verse itself is a modified copy of the code available here: lukesthoughtdump.blogspot.com.
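The take-the-top-line step could look something like the sketch below. This is not the actual tweeting script: pop_first_line is a name I made up, the demo works on a throwaway temp file, and the tweet itself is replaced with an echo.

```shell
# Print the first line of a file, then rewrite the file without that line.
pop_first_line() {
  file=$1
  [ -s "$file" ] || return 1               # empty or missing: nothing to tweet
  head -n 1 "$file"
  tmp=$(mktemp) || return 1
  # Copy everything from line 2 onward, then write it back over the original
  tail -n +2 "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Demo on a throwaway file instead of the real verse file
demo=$(mktemp)
printf 'verse one\nverse two\nverse three\n' > "$demo"
verse=$(pop_first_line "$demo")
echo "would tweet: $verse"
rm -f "$demo"
```

Because the file was shuffled ahead of time, the top line is already a random verse, so there is no line number to look up before deleting it.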
For this file, 16758 of the 31102 verses are tweetable, or 53.88% of the Bible. The cron job is set up to tweet once an hour, which means that it will finish after 699 days (1.91 years or 1 year and 334 days). It’ll be very easy to kick it off again at that time!
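For anyone who wants to check the arithmetic, here is a quick sketch. The function name schedule_stats is made up, and it assumes exactly one tweet per hour with no missed runs and a 365-day year.

```shell
# Work out the tweetable share and how long one pass through the file takes.
schedule_stats() {
  awk -v tweetable="$1" -v total="$2" 'BEGIN {
    days = tweetable / 24          # one tweet per hour, 24 per day
    last = int(days)
    if (last < days) last++        # a partial day still needs a calendar day
    printf "share of verses: %.2f%%\n", 100 * tweetable / total
    printf "days of tweeting: %.2f (done on day %d)\n", days, last
    printf "years: %.2f\n", days / 365
  }'
}

schedule_stats 16758 31102
```

16758 hourly tweets come to 698.25 days, so the last verse goes out on day 699, which is 365 + 334 days.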