My Python scratchpad (it will be updated regularly)

Parsing arbitrary date from String:

from dateutil import parser  
dt = parser.parse("Aug 28 1999 12:00AM")  

Source: Stackoverflow
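When the format of the string is known in advance, the standard library can parse it without dateutil. A minimal sketch, assuming every input looks like the example string above (the format string is my assumption, not part of the original answer):

```python
from datetime import datetime

# Assumed format for strings like "Aug 28 1999 12:00AM":
# %b = abbreviated month name, %I = hour on a 12-hour clock, %p = AM/PM
FORMAT = "%b %d %Y %I:%M%p"

dt = datetime.strptime("Aug 28 1999 12:00AM", FORMAT)
print(dt.isoformat())  # 1999-08-28T00:00:00
```

dateutil is still the better choice when the format varies from one string to the next.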

Going up one or more directories in Python:

import os  
stanford_ner_path = os.path.abspath(os.path.join(os.path.dirname( __file__ ), '../..', 'stanford_ner_classifiers'))  

Source: Stackoverflow
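The same idea can probably be written with pathlib. This is only a sketch: dir_above is a hypothetical helper name, and it takes the file path as an argument instead of reading __file__ so that it can be tried anywhere.

```python
from pathlib import Path

def dir_above(file_path, levels_up, name):
    """Start at the directory of file_path, go up levels_up levels, then append name."""
    base = Path(file_path).parent      # plays the role of os.path.dirname(file_path)
    for _ in range(levels_up):
        base = base.parent             # plays the role of one '..'
    return base / name

# Made-up path, used only for illustration
print(dir_above("/home/me/project/src/pkg/module.py", 2, "stanford_ner_classifiers"))
```

Unlike os.path.abspath, this does not turn a relative path into an absolute one; calling .resolve() on the result would do that.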

Checking if a string is empty or consists only of whitespace:

if mystring and mystring.strip():
    print("not blank string")
else:
    print("blank string")

Source: Stackoverflow
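If the check is needed in several places, it can be wrapped in a small function. A sketch, where is_blank is a hypothetical name and treating None as blank is my assumption:

```python
def is_blank(value):
    """True for None, the empty string, or a string made only of whitespace."""
    return value is None or not value.strip()

for candidate in [None, "", "   \n\t", " x "]:
    print(repr(candidate), is_blank(candidate))
```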

Reversing a list or a string

st = "Python"  
st[::-1]  
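Slicing makes a reversed copy. Two alternatives I know of, sketched here for comparison, are reversed() and list.reverse():

```python
letters = list("Python")

# reversed() returns an iterator and leaves the original untouched
print("".join(reversed("Python")))  # nohtyP
print(list(reversed(letters)))      # ['n', 'o', 'h', 't', 'y', 'P']

# list.reverse() reverses the list in place and returns None
letters.reverse()
print(letters)                      # ['n', 'o', 'h', 't', 'y', 'P']
```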

To sort a dictionary by its values and return only the list of the sorted keys:

sorted(dict1, key=dict1.get, reverse = True)  

This returns a list of the keys sorted by their values; reverse = True puts the keys with the largest values first.
Source: Stackoverflow

Sort a Python dictionary by value:

>>> import operator
>>> x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
>>> sorted_x = sorted(x.items(), key=operator.itemgetter(1))
>>> sorted_x
[(0, 0), (2, 1), (1, 2), (4, 3), (3, 4)]

This returns a list of (key, value) tuples sorted by value.
Source: Stackoverflow
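If a dictionary is wanted back instead of a list, the sorted pairs can be fed to dict(). A sketch, assuming Python 3.7 or later, where dictionaries keep insertion order:

```python
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}

# Rebuilding the dict from the sorted (key, value) pairs
# gives a dict whose iteration order follows the values.
by_value = dict(sorted(x.items(), key=lambda item: item[1]))
print(by_value)  # {0: 0, 2: 1, 1: 2, 4: 3, 3: 4}
```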

To collapse runs of spaces and newlines in a text into a single space:

>>> import re
>>> text = "You did a good    job!! \n Now go away          ..."
>>> re.sub('[\n ]+',' ',text)
'You did a good job!! Now go away ...'  
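The same result seems reachable without a regular expression. A sketch using str.split(), which also handles tabs and trims the ends (slightly more than the regex above does):

```python
text = "You did a good    job!! \n Now go away          ..."

# split() with no arguments splits on any run of whitespace
# (spaces, tabs, newlines) and drops leading/trailing whitespace.
normalized = " ".join(text.split())
print(normalized)  # You did a good job!! Now go away ...
```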

You always need an "else" branch for an "if" inside a lambda function, for example in a map operation in Python:

A lambda, like any function, must have a return value.

>>> map(lambda x: x if (x<3), [1,2,3,4])
SyntaxError: invalid syntax  

This does not work because the conditional expression does not say what to return when x<3 is false. Functions return None by default, so one could write

>>> list(map(lambda x: x if (x<3) else None, [1,2,3,4]))
[1, 2, None, None]

Source: Stackoverflow
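If the goal is to drop the items that fail the test rather than replace them with None, filtering is probably what is wanted. A sketch:

```python
numbers = [1, 2, 3, 4]

# The trailing `if` in a comprehension is a filter, so no `else` is needed.
small = [x for x in numbers if x < 3]
print(small)  # [1, 2]

# filter() does the same with a predicate; list() is needed on Python 3.
print(list(filter(lambda x: x < 3, numbers)))  # [1, 2]
```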

Getting the sorted indices of a list

>>> myList = [1, 2, 3, 100, 5]
>>> [i[0] for i in sorted(enumerate(myList), key=lambda x:x[1])]
[0, 1, 2, 4, 3]

Source: Stackoverflow
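An equivalent way that avoids building the (index, value) pairs is to sort the positions directly. A sketch:

```python
myList = [1, 2, 3, 100, 5]

# Sort the positions 0..n-1, using the value stored at each position as the key.
order = sorted(range(len(myList)), key=myList.__getitem__)
print(order)  # [0, 1, 2, 4, 3]
```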

Getting individual count of all the items in a list

>>> from collections import Counter
>>> myList = [1, 2, 3, 4, 4, 4]
>>> Counter(myList)
Counter({4: 3, 1: 1, 2: 1, 3: 1})  

Source: Stackoverflow
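A Counter can also be queried directly. A short sketch of the lookups I find most useful:

```python
from collections import Counter

myList = [1, 2, 3, 4, 4, 4]
counts = Counter(myList)

print(counts[4])              # 3
print(counts[99])             # 0, missing items count as zero instead of raising KeyError
print(counts.most_common(1))  # [(4, 3)]
```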

Initializing a numpy matrix with something other than zero or one

>>> import numpy as np
>>> a = np.empty((3,3,))
>>> a[:] = np.nan
>>> a
array([[nan, nan, nan],
       [nan, nan, nan],
       [nan, nan, nan]])

Source: Stackoverflow
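NumPy also has np.full, which allocates and fills in one call. A sketch (not from the original answer) that assumes NumPy is installed:

```python
import numpy as np

# np.full(shape, fill_value) creates the array already filled
a = np.full((3, 3), np.nan)
print(a)

# NaN never compares equal to itself, so check with np.isnan instead of ==
print(np.isnan(a).all())
```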

Testing if a matrix is symmetric

>>> import numpy as np
>>> arr = np.array([[1, 2, 3], [2, 4, 5], [3, 5, 6]])
>>> (arr.transpose() == arr).all()
True  

Source: Stackoverflow
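For floating-point matrices an exact == comparison can fail because of rounding, so a tolerance-based check may be safer. A sketch, where is_symmetric is a hypothetical helper and the tolerance is an arbitrary assumption:

```python
import numpy as np

def is_symmetric(m, tol=1e-8):
    """True if m is a square 2-D array equal to its transpose within tol."""
    m = np.asarray(m)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return False
    return bool(np.allclose(m, m.T, atol=tol))

print(is_symmetric(np.array([[1.0, 2.0], [2.0, 3.0]])))  # True
print(is_symmetric(np.array([[1.0, 2.0], [0.0, 3.0]])))  # False
```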

If you index a matrix with two lists and update the values, beware of duplicate indices

>>> import numpy as np
>>> a = np.matrix([[0,0,0,0],
...     [0,0,0,0],
...     [0,0,0,0],
...     [0,0,0,0]])
>>> print(a)
[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]
>>> a[[0,0,0],[1,2,3]] += 1
>>> print(a)
[[0 1 1 1]
 [0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]
>>> a[[0,0,0,0],[1,2,3,3]] += 1
>>> print(a)
[[0 2 2 2]
 [0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]

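In the final update, the index pair (0,3) appears twice, so one might expect a[0,3] to become 3. It only becomes 2: with list indexing, += applies the increment once per distinct position, so repeated indices are not accumulated.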
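If the repeated indices should count, np.add.at performs an unbuffered in-place add. A sketch, assuming a plain ndarray instead of np.matrix and starting from zeros:

```python
import numpy as np

a = np.zeros((4, 4), dtype=int)
rows = [0, 0, 0, 0]
cols = [1, 2, 3, 3]  # the pair (0, 3) appears twice

# Every (row, col) pair is applied, repeats included
np.add.at(a, (rows, cols), 1)
print(a[0])  # [0 1 1 2]
```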

Saving a Python dictionary as a JSON file:

import json

# bn is the dictionary to save; the with block closes the file on exit
with open("betweenness_unweighted.json", 'w') as f:
    json.dump(bn, f)

Loading a Python dictionary from a JSON file:

import json

# The with block closes the file on exit, so no explicit close is needed
with open("betweenness_unweighted.json", 'r') as f:
    bn = json.load(f)
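One thing to watch in a round trip: JSON object keys are always strings, so integer keys come back as strings. A sketch with a made-up dictionary standing in for bn and a temporary file instead of the real one:

```python
import json
import os
import tempfile

# Hypothetical stand-in for the real dictionary; note the integer keys
bn = {1: 0.5, 2: 0.25}

path = os.path.join(tempfile.mkdtemp(), "example.json")
with open(path, "w") as f:
    json.dump(bn, f)

with open(path, "r") as f:
    loaded = json.load(f)

print(loaded)  # {'1': 0.5, '2': 0.25}

# Convert the keys back if the original keys were integers
restored = {int(k): v for k, v in loaded.items()}
print(restored == bn)  # True
```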

Running shell commands like grep and redirecting the output to a file:

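import os
from subprocess import call

user_id = '****************2ba0'
path = '**/development/grep_output/'
output_file_name = path+'relationship_'+user_id+'.csv'
search_directory = '**/dataset/**/user_relationship/'

## The output file is opened before changing directory,
## so its path is resolved from the original working directory
with open(output_file_name,'w') as f:
    os.chdir(search_directory)

    ## Rather than passing the directory to grep, I am searching
    ## in the current working directory using '.', which
    ## shortens the filenames in the output
    ## call() returns grep's exit status, not a process object
    grep_status = call(['grep', '-R',user_id, '.'], stdout = f)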
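On Python 3.5+ the same thing can probably be done with subprocess.run, using cwd= so the script's own working directory does not change. A sketch only: grep_to_file is a hypothetical helper name, and it assumes a grep binary is available on the PATH.

```python
import subprocess

def grep_to_file(pattern, search_directory, output_file_name):
    """Run a recursive grep inside search_directory and write its output to a file.

    Returns grep's exit status: 0 if something matched, 1 if nothing did,
    2 or more on error.
    """
    with open(output_file_name, "w") as out:
        # cwd= runs grep inside search_directory without an os.chdir(),
        # so searching '.' still keeps the reported filenames short.
        result = subprocess.run(["grep", "-R", pattern, "."],
                                cwd=search_directory, stdout=out)
    return result.returncode
```

It would be called as grep_to_file(user_id, search_directory, output_file_name). The output file should live outside the search directory, otherwise grep ends up scanning its own output.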