There's already a lot of OS X stuff on the web. This page is simply a cookbook of how I used those resources to put together my system. What follows is intended to be step-by-step instructions for new users to set up their OS X Mac to work in a Unix environment. As such, it is not for everyone. I also don't intend to be comprehensive in listing all the resources that are dealt with elsewhere, but do point to links where relevant.
I have not updated this page for each OS X release, but most things here are generic to any release since 10.2.
For current HPC resources for OS X, see http://hpc.sourceforge.net/.
OS X puts all the standard commercial software tools (Word, Illustrator, Matlab) and a Unix programming environment seamlessly into a single box. This offers many advantages for scientific programming relative to other desktop systems based on Windows, Linux, or Unix. While many Unix and Linux packages exist for document preparation, there are many reasons one might want to have access to the standard commercial tools that run only on Windows and Macintosh computers (eg, from Microsoft and Adobe). In the past, however, integrating these computers into the primary computing system has been awkward.
OS X's Unix core allows NFS mounts and X windows, and most of the common Unix/Linux software tools can be run on the Mac, so the Mac can be just another Unix computer. Fortran or C code can be directly shared between the Mac and other platforms. So one no longer needs to decide between having a Unix system or an office system on one's desk; OS X does both.
Apple has a page on Unix development, which is a good place to start.
As shipped, OS X is set up for the non-technical traditional Mac user, not like a Unix box. System administration is also not exactly the same as on a Unix box. So, a few extra steps are needed to get it all to work.
Here are the things I needed to do to make my Powerbook function for me using open-source tools. There may be better ways around, but without much documentation, this is what I needed to do. (Wherever I give command-line instructions (ie within the shell by running Terminal), the % sign is the prompt for a normal user. The sudo command gives you administrator privileges and can only be run by someone in the admin group.)
Install X windows.
Apple supplies an X11 emulating application, but it is not part of the standard installation. If installing from CDs, choose a custom installation and you will be given the option for X11. For new computers, an X11 installer is probably somewhere on the hard drive.
You can also run XFree86 from the fink project (see below).
The Mac uses a different binary format from Intel, AMD, and Alpha systems. The G4 uses so-called 'big endian' byte order, as do most other Unix platforms. x86 (Intel, AMD) and Alpha (DEC/Compaq/HP) use 'little endian'. This can cause problems with sharing binary data among platforms. The best solution is to use a platform-independent self-describing format such as netCDF or HDF, or even gzipped ASCII.
NetCDF is available for OS X from the above link or from fink. These OS X ports only support the Fortran 77 interface via g77. If you use a commercial Fortran 95 compiler, such as NAGware, compile it yourself according to these instructions in order to get the f90 interface.
HDF is also available for OS X from the HDF web site linked above or from fink.
For dealing with raw binary, read this article.
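To make the byte-order problem described above concrete, here is a minimal sketch in Python using only the standard struct module. The function names and sample values are made up for illustration; they are not from any of the packages mentioned on this page. It packs the same three 32-bit floats the way a big-endian machine (such as a G4) and a little-endian machine (such as an x86 box) would, then shows that reading with the wrong byte order gives wrong numbers unless the bytes of each word are swapped.

```python
import struct


def pack_floats(values, byteorder):
    """Pack 32-bit floats; byteorder is '>' (big endian) or '<' (little endian)."""
    return struct.pack(f"{byteorder}{len(values)}f", *values)


def unpack_floats(data, byteorder):
    """Unpack a buffer of 32-bit floats using the given byte order."""
    count = len(data) // 4
    return struct.unpack(f"{byteorder}{count}f", data)


def swap_float32(data):
    """Reverse the bytes inside each 4-byte word (big <-> little endian)."""
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


# Hypothetical sample data, chosen to be exactly representable as 32-bit floats.
values = (1.0, 2.5, -3.75)

g4_bytes = pack_floats(values, ">")    # layout written by a big-endian machine
x86_bytes = pack_floats(values, "<")   # layout written by a little-endian machine

print(unpack_floats(g4_bytes, ">"))    # read with the matching byte order
print(unpack_floats(g4_bytes, "<"))    # read with the wrong byte order
print(swap_float32(g4_bytes) == x86_bytes)  # swapping each word converts the layout
```

Self-describing formats such as netCDF and HDF record the layout with the data, which is why they avoid this bookkeeping altogether.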
Optimized math libraries.
If you do any matrix operations on a G4/G5, you'll want to use LAPACK and BLAS libraries that take advantage of the vector processor.
Macs are now included in the Automatically Tuned Linear Algebra Software (ATLAS) package. There is even a precompiled G4 library here. ATLAS contains a full tuned BLAS and a partial LAPACK.
Since OS X 10.2, Apple includes optimized ATLAS/LAPACK libraries in the vecLib framework.
Both NAG f95 and g77 accept the -framework vecLib flag to link to vecLib. Eg:
Absoft has directions here.
Install other tools as needed. Complete lists exist elsewhere on the web, such as:
Here's some specific stuff I like:
Emacs text editor:
I've found that the best emacs for OS X is GNU Emacs in its new Carbon version. This is just beginning to be supported, so there are no official binaries. Your best bet is to build it from source or try to find a reasonably current binary online.
Two good binary downloads are linked by Apple:
To build from source, download the source from CVS, and compile:
For X windows, Xemacs is a good option, and is available from fink.
NetCDF: This is an essential package for atmospheric science datasets. If you use g77, you can get it from fink. If you use a Fortran 90 compiler, you'll need to download the source and build it yourself. It builds nicely with NAG f95 following this procedure.
X-Y plots: grace (install using fink)
Plotting atmospheric data fields:
Simple command-line scripting package: grads. OS X version from Whit Anderson at COLA.
NCAR Graphics is available as pre-built binaries. These use g77, but can be used with XLF or NAG compilers if you include the g2c library when you compile (ie add -lg2c).
Adobe Illustrator. EPS files from Grace, GrADS and NCAR Graphics (eps or cgm) are easily opened and edited with Illustrator.
Microsoft Office. Adobe eps files can easily be incorporated into word documents.
Set up printers. You can add networked lpr printers hanging off Unix boxes using Print Center if the printers don't support AppleTalk. If Print Center crashes, then read fixes for Print Center.
Many of the popular tools used by atmospheric and ocean scientists for data analysis and visualization are available on OS X. These include:
Student version $99, Commercial package $1,900 - Many plugins available
Free binary download. The fortran libraries are built using g77, but can be used with XLF or NAG compilers if you include the g2c library when you compile (ie add -lg2c) and the -qextname flag for xlf.
NCAR Graphics is a Fortran and C based software package for scientific visualization.
Free binary download
NCL is a programming language designed specifically for the analysis and visualization of data
Free binary download
The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data
Free source download
NetCDF (network Common Data Form) is an interface for array-oriented data access and a freely distributed collection of software libraries for C, Fortran, C++, Java, and Perl that provide implementations of the interface. NetCDF is the standard format for distribution of atmospheric and ocean data worldwide.
Free source code download
Vis5D is a software system that can be used to visualize both gridded data and irregularly located data. Sources for this data can come from numerical weather models, surface observations and other similar sources.
Free binary. OS X support is only partial/beta
Ferret is an interactive computer visualization and analysis environment designed to meet the needs of oceanographers and meteorologists analyzing large and complex gridded data sets.