Thursday, 25 November 2010

regex review

Find the Amazon UK page listing Digital Cameras. Use View Source in your browser to examine the HTML for this page. How are the prices tagged?

Explain briefly how XSLT could be used to extract a table of prices from this page.

On the printouts provided, circle the (non-empty) strings matched by each of the following regular expressions. You can use a separate sheet for each expression.



Can you write a regular expression that selects all the model numbers on this page and nothing else? Use the regular expression sandpit to check your answer – copy text from the web page to replace the Jabberwocky poem.

The Amazon page also has details of resolution (megapixels), zoom factor (e.g. 8x Optical) and screen size (e.g. 2.7 inch), but these are not tagged specially in the page source.

To tag them you would first have to find them. Use the regular expression sandpit to develop three regular expressions, one for each data type, each matching all occurrences of its type.
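If you want to try your patterns outside the sandpit, the sketch below shows one possible starting point using Python's standard re module. The sample text is invented for illustration and is not copied from the Amazon page. The three patterns assume each value is written as a number followed by a unit word ("Megapixel" or "MP", "x Optical", "inch"), so they will probably need adjusting once you test them against the real page.

```python
import re

# Invented sample text (an assumption, not the real Amazon page content).
sample = (
    "Canon PowerShot 12.1 Megapixel Digital Camera, 8x Optical Zoom, 2.7 inch LCD\n"
    "Nikon Coolpix 14 MP Compact, 5x Optical Zoom, 3 inch screen"
)

# A number with an optional decimal part, reused by all three patterns.
NUMBER = r"\d+(?:\.\d+)?"

# Resolution: assumes the unit is spelled "megapixel(s)" or abbreviated "MP".
MEGAPIXELS = re.compile(r"\b" + NUMBER + r"\s*(?:megapixels?|MP)\b", re.IGNORECASE)

# Zoom factor: assumes the form "<number>x Optical", as in "8x Optical".
ZOOM = re.compile(r"\b" + NUMBER + r"x\s+optical\b", re.IGNORECASE)

# Screen size: assumes the form "<number> inch", as in "2.7 inch".
SCREEN = re.compile(r"\b" + NUMBER + r"[\s-]*inch(?:es)?\b", re.IGNORECASE)


def find_all(pattern, text):
    """Return every non-overlapping substring of text matched by pattern."""
    return [match.group(0) for match in pattern.finditer(text)]


if __name__ == "__main__":
    for name, pattern in [("megapixels", MEGAPIXELS), ("zoom", ZOOM), ("screen", SCREEN)]:
        print(name, find_all(pattern, sample))
```

Keeping one pattern per data type, rather than one combined pattern, makes it easier to see which expression is responsible for a missed or spurious match.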

Friday, 12 November 2010

Assignment 3: The Tablet

Deadline: Monday 29 November, 16:00

You can submit your report via TurnItIn. See the email of 26th November for signup details. Once you have registered, you can sign in at the following URL:

The deadline for your submission is 1600 UTC on Monday 29th November. Unless you have good reason for not meeting this deadline, and we have agreed an extension, late work will not be marked.

Electronic submission is preferred, but if for any reason you cannot submit your work using TurnItIn, you should submit two hard-copies of your report to the ITO on the 4th floor of the Appleton Tower before the deadline.

First, download and read a document published in 1988 entitled,
"TABLET: the personal computer of the year 2000".

The abstract says,
"The University of Illinois design extends the freedom of pen and notepad with a machine that draws on the projected power of 21st century technology. Without assuming any new, major technological breakthroughs, it seeks to balance the promises of today’s growing technologies with the changing role of computers in tomorrow’s education, research, security, and commerce.

"The design is simple, yet sleek. Roughly the size and weight of a notebook, the machine has no moving parts and resembles the dark, featureless monolith from a well known movie."
For this assignment you should write a report (1,700-2,500 words), entitled "Predicting the Future", using this paper as a case study. Your report should discuss the extent to which the future of technology can be predicted 10 to 20 years ahead.
  1. Briefly summarize the key features of the personal computers available in 1988.
  2. You should examine the technological trends on which the authors base their predictions (p.28). Select three (or more) key technology metrics and report on how well their development over the past 22 years has matched the predictions of the paper. 
  3. Use the internet to identify, and report on, current predictions for changes in these metrics over the coming decade.
  4. How do devices available today match the predictions made 22 years ago? Which features of the "Tablet" are now available on mass-market personal devices, and when were they introduced to the market? What, if any, current technologies did the authors fail to predict?
  5. Use Google to find out what you can about what has become of the authors, and report briefly.
  6. For the final section of your report you should take a stab at making your own predictions, based on the technology forecasts you reported in part 3, for a device that might be in common use in 10-20 years' time.
Your report should be between 1,700 and 2,500 words in length. The School's standard guidelines on plagiarism apply.

Further reading

As we may think, Vannevar Bush (July 1945)

Thursday, 4 November 2010

Assignment 2 – Who owns information?

(hand-in 4pm 15 Nov)

For the second assignment, you should write a report on the way digital technologies are challenging traditional notions of privacy and ownership of information.

You can choose to focus on one of two areas
  1. One concerns issues of copyright and 'digital rights management'. How do works stored digitally differ from works embodied in media such as books or CDs? How do rights of 'fair use' fare in the face of digital rights management? Should you be able to lend items from your digital library to friends, just as you can lend embodied media? How can society best encourage creativity in the production of 'digital assets'? How can we ensure that future generations will have access to works being produced today?
  2. The second issue concerns personal data. How and why do entities such as Google and Facebook collect data? What are the personal risks and benefits of this activity? What are the risks and benefits for society? How can the risks be mitigated and the benefits enjoyed? To what extent is it desirable or practical for government to limit such activities? Is government also collecting such data – if so, why and how?
Whichever focus you choose, your report should address the following issues:
  • What is the value of the information in question? How will it be used? Who benefits financially? Who enjoys other benefits of ownership or use?
  • What is the cost of the information in question? What efforts contribute to its production, and how are they stimulated or funded?
  • What new risks and opportunities are created by the use of digital technologies to store, process and communicate this information?
  • You should also include a list of references you have consulted in the preparation of your report.
The following Wikipedia articles may be useful as background reading, but you should use the web to find original sources to inform your report.


Your report should be between two and three thousand words in length. The School's standard guidelines on plagiarism apply.

Lab session – advanced search

Google offers an advanced search page. For details, see this article or this note on search operators, or just search for "google advanced search tips". Google also offers many specialised searches.

Your task for this lab is to use these tools to find information on ownership of information. You will need this for the second assignment.

You should experiment with use of the boolean operators and synonyms.

As an exercise, try to find a query that returns as few documents as possible – while still returning at least one document. This is a competition: in case of a tie on number of documents, the shortest query wins. In case of a tie on that criterion also, the query using the shortest 'words' wins – and any string enclosed in quotes (") counts as one 'word'.