My brain extension
The public notebook of a computational neuroscientist
Sunday, 14 August 2011
Interview on Sciple.org
Sciple.org has published an interview with me. Sciple.org is a relatively new platform which is about people in science, with a special focus on young scientists. It also has a "Future" section with interesting graphs... still growing, but interesting to watch anyhow. That part reminded me a bit of Seed Magazine, a magazine about science culture, the print version of which I subscribed to some time ago and which I enjoyed quite a lot. I still read their RSS feed occasionally.
Thursday, 7 July 2011
AAAARRRRRRGH!! Matlab!!
Happy Matlab licensing trouble - yay!
We have 5 Matlab licenses in our lab. Originally, we bought them as "Concurrent licenses", that is, Matlab allowed us to have 5 instances of Matlab running in parallel, regardless of which computers they were running on. At some point, Mathworks somehow transformed these licenses into "Designated Computer" licenses, that is, each license must be associated with a designated computer and will only run on that machine. Although this was obviously a bad deal, we didn't care too much about the change back then, since we were busy doing more important stuff than dealing with licensing issues.
Anyway, Matlab is a dying species in our lab since most of us are using Python for scientific computing, except for a few legacy scripts. But every now and then, I need to run one of those legacy scripts.
I do much of my development on my laptop, but for number crunching I use our compute server. Hence, I need my computing environment on both machines, although not necessarily at the same time. I had one of these designated computer licenses, and thanks to Mathworks' provident care, I was able to deactivate and reactivate it over the web when switching between computers. So I changed the designated computer a few times between those machines. Today I wanted to change again, but Mathworks wouldn't let me:
"No more machine transfers available for this license."
WTF?
OK, you're forcing me to port even my old scripts to Python. Pity you. I have already spent too much time struggling with licensing issues - time which I would much rather spend on research. Goodbye Matlab.
Friday, 1 July 2011
Using Python decorators to work around version incompatibilities
I'm using PyNN to simulate networks of spiking neurons. PyNN is a "metasimulator" that can operate with several simulator backends, such as NEST, NEURON, and several others. The cool thing is that PyNN also has a backend for the FACETS hardware, which I'm using in a project. I can prototype the simulation in the simulator, and run it on the hardware afterwards, without changing my simulation script.
In theory.
In practice, things are a bit different. The hardware interface works with PyNN version 0.6, but PyNN has progressed towards 0.7 already. The current version of NEST works only with the current development version (0.7+, that is). This caused some headache for me and others developing for the hardware. Update: Some people wondered and asked me why I wouldn't simply use the old version of NEST that works with 0.6. Well, I could, but actually, that version has other bugs which make this solution a no-go.
Fortunately, the API changes between PyNN 0.6 and 0.7 are not so extensive, so one can work around the differences with relatively little code. Still, one wants an elegant way of detecting the PyNN version and automatically using the appropriate code.
Python decorators are particularly well suited for that purpose. A decorator is a callable (a function or a class) that takes a function as its argument and returns a replacement for it, usually another function. So you can check for the PyNN version inside the decorator and return the function which does what you want in the current PyNN version.
Confused? OK, here's an example: Assume that I want to retrieve the IDs of all cells in a population. In PyNN 0.6 I must use
def get_population_ids_06(pop):
    return [cell_id for cell_id in pop.ids()]
while in PyNN 0.7 I can use
def get_population_ids_07(pop):
    return [cell_id for cell_id in pop]
Now I want my script to figure out automatically which function to use, based on the PyNN version in use. This is where the decorator comes into play:
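def pynn_version_workaround(func):
    # func is the decorated dummy function; it is discarded in favour of
    # the version-specific implementation.
    # pynn_version is assumed to hold the PyNN version string.
    if pynn_version.split(' ')[0] == "0.6.0":
        return get_population_ids_06
    else:
        return get_population_ids_07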
Now I simply have to define a dummy function which is passed through the decorator:
@pynn_version_workaround
def get_population_ids(pop):
    pass
So, when get_population_ids is defined, pynn_version_workaround runs once, determines the PyNN version and returns the appropriate function. Calling get_population_ids then actually calls that function with the provided arguments.
Nice, isn't it?
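For anyone who wants to try the pattern without PyNN installed, here is a minimal self-contained sketch. The FakePopulation06 class and the hard-coded pynn_version string are stand-ins made up for illustration and are not part of PyNN. In a real script the version string would come from the installed package.

```python
# Hypothetical stand-in for the installed PyNN version string.
pynn_version = "0.6.0"


class FakePopulation06:
    """Toy population that only offers the 0.6-style ids() method (illustration only)."""

    def __init__(self, ids):
        self._ids = list(ids)

    def ids(self):
        return iter(self._ids)


def get_population_ids_06(pop):
    # 0.6-style access: ask the population for its IDs.
    return list(pop.ids())


def get_population_ids_07(pop):
    # 0.7-style access: iterate over the population directly.
    return list(pop)


def pynn_version_workaround(func):
    # Called once, when the decorated function is defined.
    # The dummy function passed in as func is discarded.
    if pynn_version.split(' ')[0] == "0.6.0":
        return get_population_ids_06
    return get_population_ids_07


@pynn_version_workaround
def get_population_ids(pop):
    pass


ids = get_population_ids(FakePopulation06([3, 4, 5]))
print(ids)
```

Note that the version check happens only once, at definition time, so there is no per-call overhead. After the decoration, get_population_ids is simply another name for the version-specific function.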
Tuesday, 17 May 2011
Any jackass can trash a manuscript...
It seems I'm not the only one getting hilarious reviews from time to time. The journal Molecular Biology of the Cell (MBoC) has published an editorial that says exactly what I feel, titled "Any jackass can trash a manuscript, but it takes good scholarship to create one (how MBoC promotes civil and constructive peer review)".
In my opinion, one of the most important points in the article is that the relentless bashing which has become a recurring feature of many reviews will, inevitably, hurt the entire research field, because it destroys the scientific community in that field.
As I said before, I think that an open peer review process with identifiable reviewers will foster constructive criticism in the reviews. The reviewers will become visible and their contribution to the community acknowledged. The whole process will become more transparent, which is a prerequisite for a functioning scientific community.
Labels: life science, peer review, publishing
Tuesday, 1 February 2011
SEED Magazine: The scientific paper is becoming obsolete
SEED Magazine, a New York-based magazine on science culture, just published an interesting article about how science publishing is about to be transformed by the internet. You might think that this is old hat, and in fact the idea that the internet revolutionizes the way we publish and access scientific results is not new. Indeed, "the Internet" was "invented" by scientists to share knowledge. Yet subscription costs keep rising, although disseminating knowledge via the internet is much cheaper than in print. The market has obviously failed. Access to scientific results has become expensive, so expensive that the main funders of science (taxpayers) can only rarely access the knowledge which is produced using their money.
More importantly, limited access to scientific results directly harms scientific progress. It is not unusual for a paper to take about two years to get published. By the time a scientific breakthrough is published, it often fails to make a real impact, apart from discouraging other labs from working in that direction.
The tools to overcome the limitations are all there: Scientific results can be published on preprint servers (like arXiv or Nature Precedings) right after writing them up. The stream of information coming out of those servers can be filtered by the scientific community by writing blog posts, or by commenting on the preprint servers directly. So why are these tools still so rarely used in life science?
The answer is simple: Lack of incentive. Writing blog posts does not extend my contract, papers on preprint servers do not increase my university budget (as opposed to papers in high-impact journals), and publishing on a preprint server often compromises the chance to publish in popular journals.
However, the incentive will rise as more and more researchers get frustrated by the corporate science publishing machinery. As more and more university libraries cancel their subscriptions because publishers increase their fees, researchers focus on open access journals. As it is becoming more and more difficult to publish in high-ranking journals, researchers consider alternatives which enable them to spend more time on research and less on getting bashed in anonymous peer review.
For example, PLoS ONE is very successful with publishing papers that are reviewed for technical correctness only, leaving it to the reader to gauge their scientific impact. Recently, even Nature Publishing Group picked up the idea and started its own version of PLoS ONE, Scientific Reports.
While these journals make it easier to publish one's findings, it is up to the researcher to make an impact in terms of influencing the field. Doing good research takes you only half way. The other half consists of convincing other researchers about one's ideas. The great advantage is that this process takes place in public, while in anonymous peer review it is hidden from the largest part of the scientific community.
Thursday, 25 November 2010
Towards fast scientific python
Python seems to be coming of age in its role as a universal language for scientific computing. It already has a good standing in the computational neuroscience community. The Neural Ensemble project gathers some initiatives that use Python as the primary language for neuronal simulation and data analysis. Large simulator projects like NEST and NEURON adopted Python as their primary command language a few years ago already. The core of those simulators is still written in C/C++, which delivers good performance, but leads to interfacing issues with the command language. Those issues can be addressed by clever software design, but a pure-Python implementation of a simulator is much more convenient regarding maintainability and extensibility. The problem is that pure Python will lag behind the speed of compiled languages like C/C++ by an order of magnitude.
The Brian simulator is designed to be a simulator written entirely in Python. To keep up with the speed of C/C++-based simulators, Brian can generate compiled code from the Python network model. This code can also be compiled for graphics processors (GPUs), which promise high speedups for computational problems that can be parallelized efficiently. The Brian developers describe how to do just that in their article on vectorized algorithms for neuronal simulations, which is one of my current favorite papers.
Today, and that was the initial motivation for this post, I came across the announcement for the new version of Theano, a compiler for evaluating mathematical expressions on CPUs and GPUs. I haven't tried it out yet, but it definitely looks promising. But the really interesting fact is that there is lively development toward making Python not only a ubiquitous language for scientific computing (a goal which has largely been achieved already), but also an alternative in terms of performance to established software packages.
Without licence fees, and fully open source.
Labels: neural simulation, Neuralensemble, numpy/scipy, Python
Tuesday, 23 November 2010
PNAS Editorial: Impact Factor corrupts science
The (ab)use of the impact factor to evaluate the scientific merit of individuals corrupts the way scientists publish their findings, say Eve Marder, Helmut Kettenmann and Sten Grillner in their recent editorial in PNAS. Moreover, they state that the current practice of measuring scientific achievement shifts the choice of research topic towards potentially "great discoveries" (read: discoveries which will make it to Nature), although the most important findings in science were made serendipitously, and hence their eventual contribution to science could not be estimated beforehand.
However, in my opinion, the impact factor is only the tip of the iceberg. Even worse is the implicit role of author sequence on a paper. In life sciences, the first author typically is the one who did the work, and the last author is the supervisor or lab head. All authors in between are perceived to be "minor contributors". Of course, this rule leads to all kinds of problems. Fierce battles are fought over author sequence, since for PhD students, only first-author papers count, while for group leaders last-author papers are vital to demonstrate their scientific contribution.
But there can only be one first author, and one last author. Of course, there are "equal contribution" asterisks all over the place, but are they actually taken into account? After all, how much sense does it make to rely on the outdated, opaque and inflexible rule of author sequence to indicate contribution? For example, it is completely unclear how to handle interdisciplinary collaborations, which typically involve at least two PhD students and two group leaders.
A completely fair and unbiased way to state individual contributions to a scientific publication would be to list the authors in alphabetical order and have an "Author contribution" section in the paper, where the individual contributions are described in detail. In fact, this is how many disciplines handle it, for example the social sciences.
Labels: author contribution, impact factor, life science, publishing