Open Source
Contents
Open Source¶
Open Source Software
History of Python
Zen of Python
GitHub
PyPI, PIP, Conda
The Source Code for Python (which is actually written in C) is all available.
Open Source Software is free. (free = freedom)
Open source software is software that the user can:
use and install the software
view the source code of the project
distribute the software
change the source code of the project
Scientific research is often done in an exclusively open source environment.
Aside: Brief History of Python¶
First conceptualized by Guido van Rossum in the late 1980s:
ABC is a general-pupose programming language that influenced Python
goal: simple scripting language
basic syntax
indentation rather than cruly braces or begin-end blocks
developed certain data types (dictionary, list, strings, numbers)
he was the BDFL (benevolent dictator for life) until July 2018
Basic Timeline¶
1991: Python 0.9.0 was first released
1994: Python 1.0 (lambda, map, filter, and reduce)
2000: Python 2.0 (list comprehensions, full garbage collector, supported unicode)
2008: Python 3.0 released (print function; not backwards-compatible, fulfilling the 13th law of the Zen of Python)
For Context:¶
1842: Ada Lovelace - first published computer program
1940s: first high level programming language
1956: FORTRAN - first commercially available
1972: C
1980s: Consolidation, modules, & performance
1987: Perl
1990s: Age of the Internet
1991: Python
1993: R
1995: Java
Zen of Python¶
20 guidelines (the 20th is missing)
sometimes contradictory in favor of flexibility
# a hidden easter egg
import this
Tim Peters – the author of the Zen of Python – is a software developer and was a major contributor to the Python Programming language.
Clicker Question #1¶
Read through and think about the statements in the Zen of Python Which do you think is the most important?
The one I think is most important is in…
A) 1-5
B) 6-10
C) 11-15
D) 16-19
E) ¯\(ツ)/¯
Beautiful is better than ugly. Readability counts.¶
readability prioritized
Python prides itself on ease to work with
code is read more often than it’s written
documentation matters.
Simple is better than complex. Complex is better than complicated.¶
There’s more than one approach to solving a problem
Something that is overly-complicated should be reconsidered - a different approach? A simpler solution?
There should be one - and preferably only one - obvious way to do it¶
A slight shot at other programming languages
With more than one way, reading is harder and writing is easier. Flexibility not worth it to Pythonistas.
Namespaces are one honking great idea - let’s do more of those¶
Namespaces and global/local scopes prevent names in one module/scope from conflicting with names in another
Namespaces are for avoiding naming conflicts (not unecessary categorization - b/c flat is better than nested)
A Final Note¶
The Zen of Python is a set of guidelines for Python.
There are arguments for and against any of these.
Different languages take different approaches.
Language Wars are stupid.
Fun fact: Python is named after the British TV show Monty Python (not the snake).
Licenses¶
Open Source projects still have a license:
sometimes it says: go do whatever you want with this
other times it says: if you’re not a company/going to make money, you can use this
and other times it says: you can use it but you have to attribute the original developers
Open Source Developers¶
Many are community-driven:
open, public, & collaborative
pandas
was intially written by Wes McKinney
https://soundcloud.com/dataframed/data-science-tool-building
How open source packages are maintained and updated:
support from foundations
grant funding
out of the kindness of individual’s or the community’s heart(s)
Software Management & Distribution¶
Jupyter
GitHub
PyPI
PIP
Conda
Aside: Project Jupyter¶
Spun off from IPython in 2014 (Fernando Perez)
originally called “IPython Notebooks” (this is where .ipynb comes from)
intended for use by those who program in Julia, Python, and R
developed to support those who use interactive computing
This document (.ipynb) IS a JSON document
{‘key’ : ‘value’}
nested & hierarchical
operating unit is the cell (Markdown, code, LaTeX, …)
IPython¶
Interactive Python
Initial release: 2001
Command Shell for interactive computing in the shell
1991: Python 0.9.0 was first released
1994: Python 1.0 (lambda, map, filter, and reduce)
2000: Python 2.0 (list comprehensions, full garbage collector, supported unicode)
2001: IPython released
2008: Python 3.0 released (print function; not backwards-compatible, fulfilling the 13th law of the Zen of Python)
2014: Jupyter notebooks spun off
Clicker Question #2¶
Your final project is a chatbot that will discuss sports with you. How would you implement this?
A) Jupyter Notebook only
B) module only
C) module + Jupyter Notebook
D) IPython
E) Python in shell (command line)
Clicker Question #3¶
Your final project is a basic data analysis where you read a file in, create a few plots, and do a basic statistical analysis. How would you implement this?
A) Jupyter Notebook only
B) module only
C) module + Jupyter Notebook
D) IPython
E) Python in shell (command line)
GitHub¶
Similar to Dropbox or Google Drive…but for code.
Benefits of GitHub¶
Share code
Work collaboratively
See others’ code
Contribute to others’ code
But…what if you just want to use what’s available to you in Python…
PyPI: Python Package Index¶
Where packages are hosted. You can search and use packages from Python.
PIP: Python Installer¶
It searches PyPi for you and allows you to install new python packages from the terminal.
The format for this is:
pip install package_name
On Datahub, you have to use the --user
flag
pip install --user package_name
# for example
!pip install neruodsp
Dependencies¶
A Note on Conda¶
pip
manages single installs. Conda
manages the whole system / environment and interactions between packages.