Entries : Category [ Hacks ]
Computer and electronics tricks and fooling around

22 February
2007

This blog is based on the Zope product COREBlog. While COREBlog 1.2.5 (and prior versions) allows an article to be assigned both primary and secondary categories when it is created, the secondary categories are not used in the blog index, which seems like a clear oversight.

Sascha Welter noticed this, figured out the problem, and fixed it with a patch for an earlier version of COREBlog. That patch isn't quite right for COREBlog 1.2.5, but the trivially modified patch below seems to do the job (see the extended version of this article by clicking the 'permalink' or 'read more' links below).

Note that the per-category entry counts will then look wrong, since they reflect only the primary category of each article. Fixing this requires a second patch, which I cooked up (see the extended version of this article for details and patches, by clicking the 'permalink' or 'Continue reading' links below).


Patch 1: fixing the categories:

*** COREBlog-original.py        Thu Feb 22 19:11:05 2007
--- COREBlog.py Thu Feb 22 19:12:54 2007
***************
*** 912,918 ****
          l = []
          for id in self.entry_list:
              obj = self.getEntry(id)
!             if obj.category and obj.category[0] == category_id:
                  if not consider_moderation or (obj.moderated and obj.date_created() <= DateTime()):
                      l.append(obj)
          return l
--- 912,920 ----
          l = []
          for id in self.entry_list:
              obj = self.getEntry(id)
!             # line below modified to check for id and any of the object categories,
!             # not just the first (primary) one. GD.
!             if obj.category and category_id in obj.category:
                  if not consider_moderation or (obj.moderated and obj.date_created() <= DateTime()):
                      l.append(obj)
          return l
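The effect of the one-line change can be seen in isolation with a minimal Python sketch. The `Entry` class and the two helper functions are hypothetical stand-ins invented for illustration, not COREBlog code; the sketch assumes only that an entry's `category` attribute is a list with the primary category first.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical stand-in for a COREBlog entry (not the real class)."""
    title: str
    category: list = field(default_factory=list)  # primary category first


def entries_primary_only(entries, category_id):
    # Original behaviour: only the first (primary) category is compared.
    return [e for e in entries if e.category and e.category[0] == category_id]


def entries_any_category(entries, category_id):
    # Patched behaviour: the entry matches if the id is any of its categories.
    return [e for e in entries if e.category and category_id in e.category]


entries = [Entry("a", [1, 5]), Entry("b", [5]), Entry("c", [2])]
print([e.title for e in entries_primary_only(entries, 5)])
print([e.title for e in entries_any_category(entries, 5)])
```

With the original test, entry "a" is missing from the listing for category 5 because 5 is only its secondary category; with the patched test it is included.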

To fix the item counts, the second patch below also needs to be applied. It correctly updates the counts for any new articles, but not for old ones. To force the counts to be recomputed for all existing articles, you can visit the method "manage_calculateCategory" from the browser (e.g. http://YOURSITE/blog/manage_calculateCategory ), which prints nothing but causes the counts to be recomputed. You only need to do this once after installing the patch.

*** 1339,1347 ****
              cat.set_count(0)
          for id in self.entry_list:
              ent = self.getEntry(id)
!             if ent.category and self.categories.has_key(ent.category[0]):
!                 cat = self.categories[ent.category[0]]
!                 cat.set_count(cat.get_count()+1)
  
          #reset datemap
          self.datemap = IOBTree()
--- 1341,1353 ----
              cat.set_count(0)
          for id in self.entry_list:
              ent = self.getEntry(id)
!             #if ent.category and self.categories.has_key(ent.category[0]):
!             # GD. Fix to use sub-categories too.
!             if ent.category:
!                 for ecat in ent.category:
!                    if self.categories.has_key(ecat):
!                       cat = self.categories[ecat]
!                       cat.set_count(cat.get_count()+1)
  
          #reset datemap
          self.datemap = IOBTree()
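The counting logic of the second patch can be sketched the same way. This is only an illustration under stated assumptions: `count_entries` and its arguments are hypothetical names, plain lists and dictionaries stand in for the COREBlog objects, and a dictionary membership test replaces the `has_key` call used in the patch.

```python
def count_entries(entry_categories, known_categories, primary_only=False):
    """Count entries per category.

    entry_categories: one list of category ids per entry, primary id first.
    known_categories: ids of the categories that exist in the blog.
    primary_only: True mimics the original code, False mimics the patch.
    """
    counts = {cat_id: 0 for cat_id in known_categories}
    for cats in entry_categories:
        if not cats:
            continue
        considered = cats[:1] if primary_only else cats
        for cat_id in considered:
            if cat_id in counts:  # skip ids that are not known categories
                counts[cat_id] += 1
    return counts


blog = [[1, 5], [5], [2], []]
print(count_entries(blog, [1, 2, 5], primary_only=True))
print(count_entries(blog, [1, 2, 5]))
```

As in the patch, an entry is counted once for each category it lists, so the per-category counts can add up to more than the number of entries.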

By Gregory Dudek
15 March
2007

Background: Various groups track information flow by logging the URLs being exchanged when data is uploaded and downloaded. An initiative is currently underway by the US Department of Justice, and has already passed in Europe, to force internet service providers (ISPs) to log data transfers and to retain these logs for a long time (despite the large amount of storage this requires). This is an ongoing effort and follows a prior effort in the same direction. It seems the objective is not to save the actual data, just the information linking the URLs that were used: where you visited and what filename you uploaded. Such information is already used for DMCA "take-down" requests, and the legal page at the infamous Pirate Bay BitTorrent tracker provides many (amusing) examples. It also represents a huge incursion into personal privacy, which is threatening in many ways.

Utility?: While anti-terrorism is cited as one of the benefits of this plan, it seems unlikely that actual terrorists operate by uploading data to public sites. The initiative is more likely to be motivated by DMCA enforcement. Even the most trivial password protection and encryption by organized groups (such as terrorists) gets around this measure. I suppose one still might be able to catch a really foolish bad guy, which sounds insignificant, but perhaps that is not as irrelevant as it seems.

Related work:


Once such linkages between users and URLs are made, there are a lot of very interesting data mining possibilities. Google is surely looking at doing this right now for commercial purposes (e.g. targeted advertising). This is very close to work we are doing (abstract) (pdf) to unravel the positions of robots or sensors deployed in space. The connections one might get can be insightful, but also very misleading at times, and this is worrisome. It means, in principle, that you could get into trouble for using certain Google search words, without even downloading anything. This is akin to patrolling people's thoughts.

Circumvention:
In the case of web traffic and DMCA enforcement, however, it seems like this effort to simply log traffic can be easily circumvented or obfuscated. A current practice is to pursue people if an upload of theirs has been downloaded "too many" times. If the data provider simply uses cryptic URLs and rotates them often, as illustrated below, then the logged data becomes almost useless. The URL doesn't tell you anything and surely doesn't prove much (e.g. the URL for this article might be blog/41 now, but tomorrow it becomes blog/21111). This makes permanent links tricky for the user, but many such links already lead only to index pages that provide the connections between the URLs and the descriptions of the content they provide. Of course, one could log that too, but then it becomes much, much more complicated, since doing it without human intervention involves producing (essentially) a snapshot of the whole internet on a regular basis. In short, this proposal seems fraught with problems, both from a technical standpoint and with respect to personal privacy.
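The URL rotation idea can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the `RotatingIndex` class, its method names, and the use of random hexadecimal paths are hypothetical and are not how this blog actually works. The index page role is played by the description-to-path mapping.

```python
import secrets


class RotatingIndex:
    """Hypothetical index mapping stable descriptions to short-lived opaque paths."""

    def __init__(self):
        self.path_for = {}    # description -> current opaque path
        self.content_at = {}  # opaque path -> content

    def _fresh_path(self):
        # A random path that says nothing about the content it names.
        path = "blog/" + secrets.token_hex(4)
        while path in self.content_at:
            path = "blog/" + secrets.token_hex(4)
        return path

    def publish(self, description, content):
        path = self._fresh_path()
        self.path_for[description] = path
        self.content_at[path] = content
        return path

    def rotate(self, description):
        old = self.path_for[description]
        new = self._fresh_path()  # chosen while old is still taken, so new != old
        self.content_at[new] = self.content_at.pop(old)
        self.path_for[description] = new
        return new

    def fetch(self, path):
        # Retired paths no longer resolve to anything.
        return self.content_at.get(path)


index = RotatingIndex()
first = index.publish("this article", "article text")
second = index.rotate("this article")
print(first, second, index.fetch(first), index.fetch(second))
```

A log that recorded only the first path now points at nothing, while a reader who goes through the index still reaches the article.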



Try it: URL content changes after a few clicks.


The simple example shown here illustrates a URL (for the picture) that delivers different content at different times. Note that this is not the same as just changing the images linked into a page, because the actual URL of the image itself doesn't change, but the content it points to changes. The first time you click it you get an image of "secret" troop deployments (that might violate the DMCA). If you reload the same URL a few times, you get something more benign. Hence, knowing who accessed the URL doesn't provide any information, unless you actually store the data too (which isn't practical). (Approximate source code for above example here.)
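The behaviour described above, one URL whose content depends on how often it has been requested, can be sketched in a few lines of Python. This is not the source code of the example on this page; the `SwitchingResource` class, the payload names, and the `switch_after` parameter are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class SwitchingResource:
    """Serve different payloads from the same URL depending on the request count."""

    def __init__(self, payloads, switch_after=1):
        self.payloads = payloads          # ordered payloads, the last one is kept forever
        self.switch_after = switch_after  # requests served before moving to the next payload
        self.hits = defaultdict(int)      # url -> number of requests seen so far

    def fetch(self, url):
        step = self.hits[url] // self.switch_after
        index = min(step, len(self.payloads) - 1)
        self.hits[url] += 1
        return self.payloads[index]


resource = SwitchingResource(["secret-deployments.jpg", "benign.jpg"], switch_after=1)
for _ in range(3):
    print(resource.fetch("/images/picture.jpg"))
```

A log of requests for `/images/picture.jpg` looks identical for every visitor, even though only the first request received the sensitive payload.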

By Gregory Dudek
10 March
2007

This is a list of really useful Python modules.



While I really dislike Python's poor support for concurrency (in particular the GIL, the global interpreter lock, which limits threads), a flaw I think is so serious that it may prove fatal, the rest of the language is still really great.


By Gregory Dudek
02 May
2007

Various sites on the internet, including this one, have been deluged by visitors attempting to visit the web page named 09-F9-11 ... -56-88-C0 even where that page does not exist! The result is errors in my web log containing this code. This number is used to decode HD DVDs, and there is a major flurry of activity over the fact that there have been efforts to use the DMCA to suppress this information. For sites in the USA, it may be illegal even to post a news article like this one, despite the fact that I am only reporting on this ongoing phenomenon.

The social news aggregation site digg.com was overrun with posters putting up various articles about this number, or simply featuring this number, with posts going up faster than they could be removed. This activity was provoked, in particular, because an article with this number in it was removed and this was interpreted as censorship.


By anonymous
07 June
2007

I have installed Familiar Linux (0.8.4) on an iPAQ 3900, with the Opie user interface.

I messed up the touchscreen calibration and had a very hard time fixing it. If anybody ever has this problem (and it is reported occasionally on the net), you can get things (approximately) working using my machine's calibration data. Replace the contents of the file /etc/pointercal with this data:


24271 -110 -1037464 -228 17819 -808152 65536


Maybe this will help somebody, somewhere avoid the hassle I had.
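For anyone curious what the seven numbers mean, here is a hedged Python sketch. It assumes the file uses the seven-integer linear layout (a b c d e f s) found in tslib and Qt/Embedded calibration files, where the screen position is (a*x + b*y + c)/s and (d*x + e*y + f)/s; whether this particular Opie build interprets the file exactly that way is an assumption, and the function names and the raw sample value are made up for illustration.

```python
def parse_pointercal(text):
    """Parse the contents of a pointercal file into seven integers."""
    values = [int(token) for token in text.split()]
    if len(values) != 7:
        raise ValueError("expected 7 integers, got %d" % len(values))
    return values


def raw_to_screen(cal, raw_x, raw_y):
    """Apply the assumed linear calibration to one raw touchscreen sample."""
    a, b, c, d, e, f, s = cal
    # int() truncates toward zero, like integer division in C.
    screen_x = int((a * raw_x + b * raw_y + c) / s)
    screen_y = int((d * raw_x + e * raw_y + f) / s)
    return screen_x, screen_y


CAL = parse_pointercal("24271 -110 -1037464 -228 17819 -808152 65536")
print(raw_to_screen(CAL, 500, 500))  # (500, 500) is an arbitrary raw sample
```

Under that assumption, the first and fifth numbers are the scale factors for x and y, the third and sixth are offsets, and the last is the common divisor.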


By Gregory Dudek
06 July
2007

I am using a Canon SD430 digital camera for a project.

This is a decent 5MP camera with the ability to transfer images or do remote control image capture (or previewing) via 802.11b wireless (WiFi). It is normally expensive, but Amazon is having a clearance sale and selling the camera for $159.

The link is here [www.amazon.com] but I don't know how long it will stay available.

[Update subsequent to the original post: this price only lasted a short time and the camera is now more expensive again.]

As a webcam this is great! It provides much better quality data than a standard webcam, and it can also be connected to any wireless connection you might have. The camera works with connections in either Ad Hoc or Infrastructure mode (i.e. either a laptop directly or an access point). You do need a wired connection to MacOS or Windows to configure it, though (too bad you can't associate with an arbitrary access point without that, or you could send photos while wardriving).



By Gregory Dudek