========================================
== The Salopian Scientific Collective ==
========================================
It's just me

Using the ChatGPT Python library to make a language-learning tool

ChatGPT AI Python
I’m learning German. So many AI-enabled language-learning apps have appeared in the past few years, with a multitude of features, but sometimes I want just one simple thing. German has a very different word order from English, and is also much stricter about word choice. It matters whether you translate “to change” as wechseln, verwechseln, umstellen, ändern, verändern etc. ChatGPT is great at writing fluent, simple text in multiple languages, and at choosing words that fit the full context of the text. Read more...

Efficiently handle slightly big data with Apache Arrow in R

R Apache Big Data
In systems biology, we often need to work with slightly big data: not big enough to justify setting up a database or using a high-performance cluster, but still a bit too big to work with comfortably in memory. We are talking about files in the 10 to 500 GB range, such as omics data like RNAseq or proteomics; single-cell phenotype data from high-content microscopy; and large public data repositories, like the Human Cell Atlas. The Arrow package for R lets us keep our data set on disk, dynamically loading only the rows and columns needed for our analysis. Read more...

Welcome to my weblog

Welcome to another data blog. These days it can seem like we are swimming in an ocean of AI-generated, click-optimised content. I therefore decided to start this good old-fashioned blog to share some insights and tips from my work as a systems biologist at the ETH (the federal technical university) in Zürich. Expect coding tutorials in R and Python; insights into systems biology and bioinformatics; and anything else I think of. There are no comments sections, subscriptions, adverts, sponsored links or cookies. Read more...