November 28th, 2011. No Comments.
So suppose you have an NCBI BLAST database, and you don’t have the fasta file from which it was created, or for any other reason you need to dump a BLAST database to a fasta file. Further suppose you are one of the 10 or so people on earth who have this problem and aren’t me. Maybe you tried a few things and got this error:
BLAST query/options error: Must specify query type
Or maybe you did the smart thing and googled it first. What you need is 'blastdbcmd'. Use it like so:
blastdbcmd -db my_blast_db -entry all > blast_db_dump.fna
A note about DB names: makeblastdb creates three files for a single db, e.g. for a nucleotide db named my_blast_db, it would create my_blast_db.nhr, my_blast_db.nin, and my_blast_db.nsq (a protein db gets .phr, .pin, and .psq instead). In the above command, you would only use my_blast_db, with no extension.
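If you are scripting this, a minimal Python sketch of the naming rule might look like the following. The helper names blast_db_base_name and dump_command are made up for illustration; the only real tool involved is blastdbcmd, with its -db, -entry, and -out options.

```python
from pathlib import Path

# Extensions of the three core files makeblastdb writes
# (n* for nucleotide dbs, p* for protein dbs).
BLAST_DB_EXTENSIONS = {".nhr", ".nin", ".nsq", ".phr", ".pin", ".psq"}


def blast_db_base_name(path):
    """Strip a BLAST db file extension, if present, leaving the name -db expects."""
    p = Path(path)
    if p.suffix in BLAST_DB_EXTENSIONS:
        return str(p.with_suffix(""))
    return str(p)


def dump_command(db_path, out_path):
    """Build the blastdbcmd argument list that dumps every entry to out_path."""
    return ["blastdbcmd", "-db", blast_db_base_name(db_path),
            "-entry", "all", "-out", out_path]
```

Passing the result to subprocess.run(..., check=True) would run the dump, assuming BLAST+ is installed and on the PATH.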
Rearrangement of Viamics data storage
November 22nd, 2011. No Comments.
So in order to accomplish some of the things we’ve wanted for a while now in Viamics, the plan is to change the server to incorporate a relational DB where the analyses will be stored. Here’s how the current design of this database looks:
- Process for running an analysis would keep the fasta file around while doing BLAST/RDP, then add seqs to DB (compressed, obvi) and remove the file. This could be accomplished with new/rewritten modules, and changing a few assumptions in server.py
- Other analysis types e.g. QPCR, env might work by simply leaving sequence and/or sample blank as appropriate
- In order to accomplish this, the modules and tools responsible for running analyses and handling data storage will be heavily altered. This will allow for more efficient handling of different analyses of the same data. The standard technique for classifying sequences which confuse RDP is to run them again in BLAST, and the design above is intended to allow de-duplication at the sequence level.
- Storing data in this form will lay the groundwork for any kind of comparison/sample map we might need.
As noted above, one thing I’m not quite sure about is how to handle the various types of clinical data we might need, in such a way that we can take advantage of the DB querying mechanisms to query samples by age, sex, healthy/sick, etc. I picture these as possibly another table which would belong to the samples table, and have columns sample_id, attribute_name, categorical/quantitative, value (or maybe two value cols for cat/quant). This feels weird because really I want to be able to do something to the effect of:
SELECT * FROM samples WHERE samples.id = :id AND samples.sex = 'male'
(pardon my ignorance of SQL).
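The separate attribute table described above is commonly called the entity-attribute-value (EAV) pattern. Here is a minimal sketch of it using Python's built-in sqlite3 module. The table name sample_attributes, the column names, and the three sample rows are all invented for illustration; this is not the actual Viamics schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE samples (id INTEGER PRIMARY KEY, name TEXT);
    -- One row per (sample, attribute); only one of the value columns is filled.
    CREATE TABLE sample_attributes (
        sample_id INTEGER NOT NULL REFERENCES samples(id),
        attribute_name TEXT NOT NULL,
        kind TEXT NOT NULL CHECK (kind IN ('categorical', 'quantitative')),
        cat_value TEXT,
        quant_value REAL,
        PRIMARY KEY (sample_id, attribute_name)
    );
""")
# Made-up example data.
conn.executemany("INSERT INTO samples VALUES (?, ?)",
                 [(1, "s1"), (2, "s2"), (3, "s3")])
conn.executemany(
    "INSERT INTO sample_attributes VALUES (?, ?, ?, ?, ?)",
    [
        (1, "sex", "categorical", "male", None),
        (1, "age", "quantitative", None, 34.0),
        (2, "sex", "categorical", "female", None),
        (2, "age", "quantitative", None, 51.0),
        (3, "sex", "categorical", "male", None),
        (3, "age", "quantitative", None, 62.0),
    ],
)

# Filtering on one attribute takes one join against the attribute table.
male_ids = [row[0] for row in conn.execute(
    """SELECT s.id FROM samples AS s
       JOIN sample_attributes AS a ON a.sample_id = s.id
       WHERE a.attribute_name = 'sex' AND a.cat_value = 'male'
       ORDER BY s.id""")]

# Filtering on two attributes takes one join per attribute.
older_male_ids = [row[0] for row in conn.execute(
    """SELECT s.id FROM samples AS s
       JOIN sample_attributes AS sex
         ON sex.sample_id = s.id AND sex.attribute_name = 'sex'
       JOIN sample_attributes AS age
         ON age.sample_id = s.id AND age.attribute_name = 'age'
       WHERE sex.cat_value = 'male' AND age.quant_value > 50
       ORDER BY s.id""")]
```

The queries still work, but each extra attribute in the filter costs another join, which is probably the source of the "feels weird" reaction.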
This brings up the other way I’ve thought about, which is to include a bunch of columns in the samples table, which would be labeled something like attr1, attr2, etc. since we don’t know in advance which attributes we’ll need to store for different analyses. Maybe then the analysis table could have a mapping (stored as say, json in a text column) to tell which attribute columns are used, their names, and whether they are qualitative/quantitative. This seems like it should be consistent at the analysis level, but definitely differ between them. Another benefit of this approach is we could add more columns if the need arose, and older analyses would not mind.
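A rough sketch of this second approach, again with sqlite3, might look like this. The three attr columns, the attr_map column, the JSON layout, and the sample rows are assumptions made for the example, not a settled design.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE analyses (id INTEGER PRIMARY KEY, attr_map TEXT NOT NULL);
    CREATE TABLE samples (
        id INTEGER PRIMARY KEY,
        analysis_id INTEGER NOT NULL REFERENCES analyses(id),
        attr1 TEXT, attr2 TEXT, attr3 TEXT
    );
""")

# The per-analysis mapping from attribute names to generic columns.
attr_map = {
    "sex": {"column": "attr1", "kind": "qualitative"},
    "age": {"column": "attr2", "kind": "quantitative"},
}
conn.execute("INSERT INTO analyses VALUES (?, ?)", (1, json.dumps(attr_map)))
# Made-up example data.
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "male", "34", None),
     (2, 1, "female", "51", None),
     (3, 1, "male", "62", None)],
)

# Column names cannot be bound as SQL parameters, so check them against a whitelist.
ALLOWED_COLUMNS = {"attr1", "attr2", "attr3"}


def samples_where(conn, analysis_id, attribute, value):
    """Return ids of samples in an analysis whose named attribute equals value."""
    row = conn.execute("SELECT attr_map FROM analyses WHERE id = ?",
                       (analysis_id,)).fetchone()
    column = json.loads(row[0])[attribute]["column"]
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column name: {column}")
    query = (f"SELECT id FROM samples "
             f"WHERE analysis_id = ? AND {column} = ? ORDER BY id")
    return [r[0] for r in conn.execute(query, (analysis_id, value))]
```

Here samples_where(conn, 1, "sex", "male") looks up which column holds "sex" for analysis 1 and then runs a plain single-table query, at the cost of a lookup step before every query.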
I can certainly imagine that I’m reinventing some common pattern here, so let me know if that’s the case.
What I want my bill-paying experience to be like
January 26th, 2011. No Comments.
Dear lenders, utilities, SaaS, and other people who apparently need my money for society to keep functioning,
All of you seem to want me to sign up for automatic recurring payments, and I see your point. I’m sure people who sign up for these things are more reliable about paying your bills, because to be honest paying my electric bill comes somewhere between “make a sandwich” and “favorably compare myself to my exes’ boyfriends on facebook” on my priority list. There are a few reasons I never do it, which I think you could fix pretty easily and get me on board. Don’t just do it for me, though; I’m sure there are loads of people in the same situation.
First, it just makes me nervous to let someone else withdraw from my account. Nothing rational here, just don’t like it. When someone says “give me your account number so I can take out what you owe me” my default will always be “no”.
Second, I like the fact that I’m reminded every month of certain expenses. Maybe one day I’ll no longer need your service and in that case I want to make it as easy as possible to remember to cut you off.
What this all adds up to is I want to have to take some action, however small, before anything leaves my account.
This is how I want the process of paying my bills to go:
First, you somehow contact my bank. You are all constantly trying to get me to sign up for automatic recurring payments, so I know this is implemented already. My bank then contacts me through (say) a text message. Then all I need to do is reply with a password, and the payment gets taken care of.
A bank could build this today. From an architecture point of view, (Warning: ignorant speculation) the bank would have my public key and would release payment when presented with a message containing the date and amount signed by the private key. Then all the merchant would need to do is send me a message and request permission, and we could reduce my end to pressing an “approve” button.
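To make the speculation a little more concrete, here is a toy Python sketch of the approve-then-release check. It swaps in a shared-secret HMAC from the standard library in place of a real public/private key signature, so unlike the scheme described above the bank would hold the same secret as me. The function names, the message layout, and the extra payee field are all invented for illustration.

```python
import hashlib
import hmac


def payment_message(payee, amount_cents, date):
    """Serialize the payment details that an approval covers."""
    return f"{payee}|{amount_cents}|{date}".encode("utf-8")


def approve(secret, payee, amount_cents, date):
    """What pressing the "approve" button would compute on my end."""
    message = payment_message(payee, amount_cents, date)
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def bank_should_release(secret, payee, amount_cents, date, approval):
    """The bank recomputes the tag and releases payment only on an exact match."""
    expected = approve(secret, payee, amount_cents, date)
    return hmac.compare_digest(expected, approval)
```

An approval for one date and amount does not validate a request for a different amount, which is the property that matters here: nothing leaves the account without an action tied to that specific payment.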
Dump a MySQL DB (e.g. for upload to a wordpress on bluehost)
December 20th, 2010. No Comments.
To dump the database to a sql file (a set of SQL commands that, when run, reproduce the database exactly), run:
mysqldump db_name -p > file.sql
It prompts for a password first. This should be the mysql password, and it will log in to mysql with the username of the current Linux user. Using a different username probably looks something like:
mysqldump -u [username] -p [db_name] > file.sql
but I didn’t need to use that so YMMV.
For a wordpress DB, the username and password are stored in wp-config.php.
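As a rough sketch, the credentials can be pulled out of wp-config.php with a few lines of Python. This assumes the usual define('DB_NAME', '...'); style lines with simple quoted values (a password containing quote characters would not parse). The function names are made up for the example.

```python
import re

# Matches lines like: define('DB_USER', 'someone');  or  define( "DB_NAME", "blog" );
DEFINE_RE = re.compile(
    r"""define\(\s*['"](DB_\w+)['"]\s*,\s*['"]([^'"]*)['"]\s*\)""")


def wp_db_settings(config_text):
    """Return the DB_* constants found in the text of a wp-config.php file."""
    return dict(DEFINE_RE.findall(config_text))


def mysqldump_command(settings):
    """Build a mysqldump argument list; -p on its own makes mysqldump prompt."""
    return ["mysqldump", "-u", settings["DB_USER"], "-p", settings["DB_NAME"]]
```

The password is deliberately left off the command line, so mysqldump still asks for it interactively.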
THIS is how you write a reverse job application
October 25th, 2010. 6 Comments.
Last week a page made its way through my corner of the internet. It was written by a young man named Andrew Horner, who was frustrated by his post-college job search. He says:
This is a reverse job application. I am done asking people to hire me, for several reasons. First and foremost, it clearly doesn’t work. Second, it closes me off to a lot of potentially amazing opportunities; I can only find and apply to so many jobs, and there are doubtlessly hundreds of thousands out there that I would be a great fit for. Third and finally, the application process undermines my value as a worker. I have gone my entire life consistently producing excellent results at every task I set my mind to, and quite frankly, employers should be coming to me, not the other way around.
He then proceeds to explain (in a very entertaining style) why people should offer him jobs, and directs them to an application form at the bottom of the page.
Aside from the fact that hundreds of thousands of people and businesses are doing exactly the same thing as him (we’re called contractors and it’s called advertising) in industries from photography to roofing, he seems to forget one other thing: Mr. Horner, like me, seeks employment in the field of software development. The cool thing about this field, as I’m about to demonstrate, is that there is basically no barrier to entry to start producing observable results. No one needs to give you a license, put you in a cockpit, give you a truck, or leave you mineral rights in their will. Anyone with a computer and the skills can just make something, and those that can, usually do.
So here’s my “Reverse Job Application”. NOLAladies.com is a webapp I made this weekend that reads the Foursquare data stream and displays on a map the locations in New Orleans (where I live) which have the highest female-male ratio currently checked in. I did everything from designing the database schema to fiddling with divs and margins to make everything line up right. A few programmers and tech companies have made similar products for other cities recently and drawn some good press attention. If they haven’t already, whoever makes a more polished version of this with some dating site affiliate links and the ability to show data from most of the US will likely have the beginnings of a decent revenue stream. Right now though, it’s a simple rails app that does the simple job of demonstrating I can make something people want.
I know lots of people out there need my skills. I’m currently living in Louisiana at least until spring of 2012 when I finish my degree, so you would either be here or hire me remotely. The ideal situation would probably be a contract job for a few months.
I can be reached at Johnny at [this domain].com
Use grep to cheat at scrabble
October 12th, 2010. No Comments.
grep '^[umnhlp]\{1,3\}e\b' /usr/share/dict/wwfriends …
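For anyone without grep handy, here is a rough Python equivalent of that pattern: one to three letters from the rack, followed by a letter already on the board, ending at a word boundary. The function name and parameters are invented for the example. Like the grep version, it lets a rack letter repeat, so it does not check how many copies of each tile you hold.

```python
import re


def playable_words(words, rack, ending, max_rack_letters=3):
    """Keep words made of 1..max_rack_letters rack letters followed by `ending`."""
    pattern = re.compile(
        rf"^[{re.escape(rack)}]{{1,{max_rack_letters}}}{re.escape(ending)}\b")
    return [word for word in words if pattern.search(word)]


# A tiny in-memory word list standing in for a dictionary file.
candidates = ["hue", "mule", "plume", "he", "e", "hem", "queue", "me"]
matches = playable_words(candidates, "umnhlp", "e")
```

With this list, matches comes out as hue, mule, he, and me; plume needs four rack letters and hem does not end on the board's e.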
more coming soon
