30
Cultural Insight Into Slashdot from Slashdot, Part 4
0 Comments | Posted by outtatime in Uncategorized
This time it’s the Old Spice guy, in a form unlike anything I could have imagined possible:
It’d need some way to determine how your eyes are focused though – whether you are intending to look at your hud or something distant. Hold up an object up to your eye about where your glasses would rest. Close the other eye that won’t see the object. Look at the object, then look at the wall behind it.
Now look back at the object. Sadly, it isn’t your eye. But if it had a fine enough resolution it could be compatible with your eye. Look down, back up. Where are you? You’re on Slashdot looking for the display your display could look like. What’s in your hand? Back at me. I have it, it’s the iPhone 5 with a display so fine you can’t tell the difference. Look again. The iPhone display is now a projector. Anything is possible when your device is made from nanoresonators and not a retina display. I’m modded up.
11
Fun with Regular Expressions: Scrubbing the Hungarian
Found myself sort of longing to un-hungarianize some PHP classes. Here is more information regarding Hungarian Notation.
First off, vim rocks.
Here is the regex to (pretty safely and reliably) de-hungarianify a PHP class:
(Each group is colored for clarity in identifying which section is which in the explanation)
%s/\$[a-z]\([A-Z]\)\([a-zA-Z]\+\)/$\l\1\2/ge | %s/\(private\|protected\|public\|var\)\s\+\(\$_\?\)[a-z]\([A-Z][a-zA-Z]\+\)/\1 \2\l\3/ge | %s/\(\$this->_\?\)[a-z]\([A-Z][a-zA-Z]\+\)\([^(]\)/\1\l\2\3/ge
Explanation:
%s/\$[a-z]\([A-Z]\)\([a-zA-Z]\+\)/$\l\1\2/ge
Does the simple variable-name transformation of $bSomeFoo into $someFoo. This also removes the Hungarian notation from statically referenced variables.
%s/\(private\|protected\|public\|var\)\s\+\(\$_\?\)[a-z]\([A-Z][a-zA-Z]\+\)/\1 \2\l\3/ge
Strips the Hungarian notation from class member variable declarations, e.g. private $_bSomeOtherFoo into private $_someOtherFoo.
%s/\(\$this->_\?\)[a-z]\([A-Z][a-zA-Z]\+\)\([^(]\)/\1\l\2\3/ge
Strips the Hungarian notation from class member variable references, e.g. $this->_bSomeOtherFoo into $this->_someOtherFoo.
Keep in mind that this can still break things if public class member variables are referenced externally from another file, or if the file you run this on references Hungarian-notation names in other classes that you haven’t updated.
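For anyone who would rather script this than run it inside vim, here is a rough Python sketch of the same three substitutions. The function names (dehungarianize, _lower_first) are made up for illustration, and the patterns are a straight port of the vim ones above, so the same caveats about breaking external references apply.

```python
import re


def _lower_first(name):
    # Equivalent of vim's \l: lower-case only the first character.
    return name[:1].lower() + name[1:]


def dehungarianize(php_source):
    """Apply the three substitutions from the vim one-liner, in order."""
    # 1. Plain variables: $bSomeFoo -> $someFoo
    out = re.sub(
        r'\$[a-z]([A-Z])([a-zA-Z]+)',
        lambda m: '$' + m.group(1).lower() + m.group(2),
        php_source,
    )
    # 2. Member declarations: private $_bSomeOtherFoo -> private $_someOtherFoo
    out = re.sub(
        r'(private|protected|public|var)\s+(\$_?)[a-z]([A-Z][a-zA-Z]+)',
        lambda m: m.group(1) + ' ' + m.group(2) + _lower_first(m.group(3)),
        out,
    )
    # 3. Member references: $this->_bSomeOtherFoo -> $this->_someOtherFoo
    out = re.sub(
        r'(\$this->_?)[a-z]([A-Z][a-zA-Z]+)([^(])',
        lambda m: m.group(1) + _lower_first(m.group(2)) + m.group(3),
        out,
    )
    return out
```

Reading a PHP file, passing its contents through dehungarianize(), and writing the result back would roughly mirror running the combined command on a buffer.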
The Republic of Facebook – http://www.visualeconomics.com/the-republic-of-facebook_2010-06-29/
The graphic is pretty cool! There are some pretty crazy numbers being thrown down too..
Googling for “python urldecode” led me to this, which contained a solution:
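import re

def htc(m):
    # Turn a matched two-digit hex escape into its character.
    return chr(int(m.group(1), 16))

def urldecode(url):
    # Match "%" followed by exactly two hex digits (0-9, a-f).
    rex = re.compile('%([0-9a-fA-F][0-9a-fA-F])', re.M)
    return rex.sub(htc, url)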
However, it seems overly complex and confusing when it could instead be equivalently represented like this:
import re
_ud = re.compile('%([0-9a-fA-F]{2})', re.MULTILINE)
urldecode = lambda x: _ud.sub(lambda m: chr(int(m.group(1), 16)), x)
Surely you agree ;)
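For comparison, the standard library already ships a decoder. The sketch below assumes Python 3, where it lives at urllib.parse.unquote (in Python 2 it was urllib.unquote), and puts it next to the regex version. The sample strings are made up for illustration.

```python
import re
from urllib.parse import unquote

# The regex-based decoder from above, restricted to real hex digits.
_ud = re.compile(r'%([0-9a-fA-F]{2})')


def urldecode(x):
    return _ud.sub(lambda m: chr(int(m.group(1), 16)), x)


sample = 'name%3Dfoo%20bar%2Fbaz'  # hypothetical input

# For plain ASCII escapes the two agree.
ascii_regex = urldecode(sample)
ascii_stdlib = unquote(sample)

# For multi-byte UTF-8 escapes they differ: unquote() decodes the byte
# sequence as UTF-8, while the regex maps each byte to its own character.
utf8_regex = urldecode('%C3%A9')
utf8_stdlib = unquote('%C3%A9')
```

So the one-liner is fine for ASCII-only input, but unquote() is probably the safer choice once non-ASCII characters show up.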
Django + Oracle = one error after another.
It took 3 hours just to get the Oracle DB driver installed and working (I ended up linking all the Oracle xxx.so libs into /lib64/xxx.so, and that worked).
Then I created my models:
class Customer(models.Model):
    id = models.IntegerField(primary_key=True)
    name = models.CharField(max_length=100)
    phone_number = models.CharField(max_length=40)
    address = models.CharField(max_length=100)
    city = models.CharField(max_length=50)
    state = models.CharField(max_length=2)
    zipcode = models.CharField(max_length=10)
    date_created = models.DateField(auto_now_add=True)
    financial_info = models.CharField(max_length=255)

    def __unicode__(self):
        return self.name

    class Meta:
        db_table = 'CUSTOMER'
        db_tablespace = 'tables'
When I went into the AdminSite to create a customer, everything seemed to be working well. I filled out the form, pressed “Save”, and then was greeted with a lovely error page:
DatabaseError at /admin/jjauto/customer/add/
ORA-00904: "CUSTOMER"."ID": invalid identifier
...
Googling this term + django, I found that there was a ticket which had the same error. At the bottom it said a fix had been integrated with the latest SVN version.
I figured out how to run different versions of Django concurrently so as not to break my other sites running fine with Django 1.1.1. After all this, I still got the same error when creating a new customer.
At this point I started digging again and discovered the SQL query that is causing the error:
SELECT * FROM (SELECT ROWNUM AS "_RN", "_SUB".* FROM (SELECT "EMPLOYEE"."ID", "EMPLOYEE"."NAME", "EMPLOYEE"."PHONE_NUMBER", "EMPLOYEE"."ADDRESS", "EMPLOYEE"."CITY", "EMPLOYEE"."STATE", "EMPLOYEE"."ZIPCODE", "EMPLOYEE"."DATE_CREATED" FROM "EMPLOYEE" WHERE "EMPLOYEE"."ID" = :arg0 ) "_SUB" WHERE ROWNUM <= 21) WHERE "_RN" > 0 (,)
So I took the query and modified it to use lower-case column names:
SELECT * FROM (SELECT ROWNUM AS "_RN", "_SUB".* FROM (SELECT "EMPLOYEE"."id", "EMPLOYEE"."name", "EMPLOYEE"."phone_number", "EMPLOYEE"."address", "EMPLOYEE"."city", "EMPLOYEE"."state", "EMPLOYEE"."zipcode", "EMPLOYEE"."date_created" FROM "EMPLOYEE" WHERE "EMPLOYEE"."id" = 1 ) "_SUB" WHERE ROWNUM <= 21) WHERE "_RN" > 0;
I ran it in SQL Developer and it executed fine, returning no hits.
It is becoming apparent that the problem is Django taking each column (e.g. “id”), and creating the table with the column name (e.g. “id”, lower-cased), but then when it runs this query, the column names become upper-case (i.e. “ID”). Ok, that is dumb and should probably not be happening, but I’ll try and work with what I’ve got here. I found the option to override the column names, and used a little vimfu to make it painless:
%s/^\(\s\+\)\([^ ]\+\)\( = .*\))$/\1\2\3, db_column='\U\2')/ge | %s/(, /(/ge
yielding:
class Customer(models.Model):
    id = models.IntegerField(primary_key=True, db_column='ID')
    name = models.CharField(max_length=100, db_column='NAME')
    phone_number = models.CharField(max_length=40, db_column='PHONE_NUMBER')
    address = models.CharField(max_length=100, db_column='ADDRESS')
    city = models.CharField(max_length=50, db_column='CITY')
    state = models.CharField(max_length=2, db_column='STATE')
    zipcode = models.CharField(max_length=10, db_column='ZIPCODE')
    date_created = models.DateField(auto_now_add=True, db_column='DATE_CREATED')
    financial_info = models.CharField(max_length=255, db_column='FINANCIAL_INFO')

    def __unicode__(self):
        return self.name

    class Meta:
        db_table = 'CUSTOMER'
Finally, the inserts work :) What a PITA…
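If vim isn't handy, the same db_column rewrite can be approximated in Python. This is only a sketch: add_db_column is a made-up helper name, and the pattern is slightly stricter than the vim one (it only accepts word characters for the field name), so treat it as an illustration of the idea and not a drop-in replacement.

```python
import re

# An indented "name = something(...)" line that ends with a closing paren.
_FIELD = re.compile(r'^([ \t]+)(\w+)( = .*)\)$', re.MULTILINE)


def add_db_column(model_source):
    """Append db_column='<UPPER-CASED FIELD NAME>' to every field definition."""
    def repl(m):
        indent, name, rest = m.groups()
        return "%s%s%s, db_column='%s')" % (indent, name, rest, name.upper())

    # Second step mirrors the "%s/(, /(/ge" cleanup for fields with no arguments.
    return _FIELD.sub(repl, model_source).replace('(, ', '(')
```

Lines that don't look like field assignments (the class line, the def, the Meta options) don't match the pattern and pass through unchanged.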
3
Learning to do natural language processing with NLTK
There are a few resources that I have found to be very helpful in learning to use NLTK for NLP.
Jacob Perkins has written several FANTASTIC articles with examples which cover the basics of how to use NLTK (main website: Streamhacker.com):
- Part of Speech Tagging with NLTK – Part 1, Part 2, Part 3
- How to Train an NLTK Chunker
- NLTK Classifier Based Chunker Accuracy
- Chunk Extraction with NLTK
Processing Corpora with Python and the Natural Language Toolkit
nltk “constraint grammar rules”
http://www.cs.bgu.ac.il/~elhadad/nlp09/hw2.html
http://cs.nyu.edu/courses/spring04/G22.2591-001/lecture3.html
http://nltk.googlecode.com/svn/trunk/doc/book/ch07.html
http://www.csie.ntu.edu.tw/~cjlin/libsvm/
http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#svmguide1
http://www.cnts.ua.ac.be/conll2000/chunking/
8
Cultural Insight Into Slashdot from Slashdot, Part 3
Comments found in article “Statistical Analysis of U of Chicago Graffiti“:
Re:License? (Score:3, Interesting)
Well, what did you expect, from a mindset is not attached to physical reality?
That it would make any sense at all?
The wall with the graffiti is a physical object.
A paper photo in your hand would be a physical object.
But neither the graffiti itself, nor a photo of it, are physical works.
They are ideas/information. Other rules apply.
“Licensing”/“copyright“ is a concept, based on the misconception that ideas/information would be physical objects, and the false need of some people, to control that information.
Trying to argue with it, using logic, is (because of that false base assumption) by definition impossible.
The real physical rules for information are: If it’s out there, it’s out. Period.
So you either never give it out, and won’t be able to prove that it exists at all. Or you give it out to your chosen group.
Which can for example be people that you trust. Or, as in this case, everybody.
In case you gave it to everybody who wants it… well, you should have thought earlier about that everybody could store and copy it at will. (Just like looking at the physical wall and then telling someone, or drawing it from memory, is storing and copying.)
It does not matter if people want to accept that. Just as it does not matter if people want to accept gravity.
You can try to enforce weird rules of behavior onto people, trough mental tricks of psychology. And it may be easier to do in this case, than it is for gravity. But in the end it’s futile. Because you can’t control the whole world. Even with ACTA.
If nothing else, you will end up banning the ability to look at it, because some people became really good at memorizing and reproducing it later. And everybody who can’t remember it, will by definition not remember that it existed.
–
Whenever anyone mentions “facts”, you know he tries to shove his dogmas down your throat.
Wish I had mod points (Score:3, Interesting)
This may be the BEST counterargument ever to “all information should be free”. Bravo!
However, while I genuinely want to mod you up, I do believe that CURRENT laws to control information are stupid. Similar to how laws can sometimes be unfairly and maliciously used to allow known murderers to remain innocent and walk freely, many patents and copyrights are unfairly and maliciously used to prevent people from contributing to the greater good of humanity. Patents in particular are a minefield — something’s clearly wrong with a system that encourages trolls to cripple the true innovators.
Back to the topic, I believe what the researcher did, copyrighting her photographs, is all right, regardless of whether she released it under Creative Commons. I don’t believe she was copyrighting the actual message on the graffiti anyway, just the expression of it on photograph. (Of course properly the copyright should be attributed to both HER and whoever made the graffiti, but then I would suppose THAT’s public domain since the original author didn’t stake a claim to it…)
–
Pet peeve: Profane people propagating perfunctory pedantry.
25
Quick wordpress wp-o-matic duplicate post disable fix
The hardhack solution to effectively and permanently stop duplicate post titles from being created in WordPress doesn’t get any easier than this:
http://linuxil.wordpress.com/2008/02/24/wp-o-matic-quick-dirty-duplicate-post-fix/
Thanks to linuxil!
12
Getting Wepbuster to work with BackTrack 4 Pre Release
So I recently found wepbuster, and have been trying to get it working with the Backtrack 4 Pre Release (or is it Pre-Final? ..).
BT4 doesn’t seem to come with all the right perl modules installed. These errors were easily “fixed” by simply installing the requested packages.
root@bt4-pre-release:~/src/wepbuster-1.0_beta$ ./wepbuster
Can't locate Number/Range.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.10.0 /usr/local/share/perl/5.10.0 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at ./wepbuster line 8.
BEGIN failed--compilation aborted at ./wepbuster line 8.
root@bt4-pre-release:~/src/wepbuster-1.0_beta$ ./wepbuster
Can't locate Algorithm/Permute.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.10.0 /usr/local/share/perl/5.10.0 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at ./wepbuster line 9.
BEGIN failed--compilation aborted at ./wepbuster line 9.
Wepbuster thread over at remote-exploit.org forums
6
Python Error/Exception: IOError: [Errno 32] Broken pipe
So, I got this weird exception when I ran the command:
python script_that_prints_output.py | head
..[some output, in fact, just the amount that head wanted(!)]..
Traceback (most recent call last):
  File "./script_that_prints_output.py", line 86, in <module>
    print out
IOError: [Errno 32] Broken pipe
I was not sure why this was happening, because when I ran the command without piping it to “head”, it ran just fine, displaying the expected output with no exceptions or errors raised.
I found the explanation for this behavior, and thought it was interesting enough to warrant a post (that way if I encounter it again at some point in the future, the odds that I’ll remember are probably greater ;):
> IOError: [Errno 32] Broken pipe
>
> Anyone know why this is happening?
That’s normal, at least with Unix. When the program on the receiving end
of a pipe decides to close its end for some reason, Unix sends the signal
‘SIGPIPE’ to the sending end. Python catches this and turns it into an
IOError exception. The only way around this (that I can think of) is to
catch the exception and exit the program gracefully. If you try to send
more data, you will get more IOErrors, since your program has nowhere left
to send data.
–
I actually thought of another way around it, albeit not an efficient one (the whole point of the SIGPIPE signal is to stop the program from running to completion, since no further output is required from it). This is definitely a rather ugly hack, but sometimes that is preferable to adding the exception handling, I guess.
python script_that_prints_output.py > tmpfile ; head tmpfile; rm tmpfile
Update 2009-12-17: Useful information about broken pipes in Python at StackOverflow
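The "catch the exception and exit gracefully" route from the quoted reply might look something like the sketch below. It assumes Python 3, where the error surfaces as BrokenPipeError instead of a bare IOError, and emit is a made-up helper name. Redirecting stdout to /dev/null follows the pattern suggested in the Python signal module documentation, so that the final flush at interpreter exit doesn't complain a second time.

```python
import os
import sys


def emit(lines, out=None):
    """Write lines to out; return False if the reader closed the pipe."""
    out = sys.stdout if out is None else out
    try:
        for line in lines:
            out.write(line + "\n")
        out.flush()
    except BrokenPipeError:
        if out is sys.stdout:
            # Point stdout at /dev/null so the flush that happens at
            # interpreter exit has somewhere harmless to go.
            devnull = os.open(os.devnull, os.O_WRONLY)
            os.dup2(devnull, sys.stdout.fileno())
        return False
    return True
```

A script would then do something like "if not emit(results): sys.exit(1)" and could be piped into head without the traceback.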