New Class: Search and the New Economy
Next semester, I will be teaching an MBA class with the title "Search
and the New Economy," and I will also be participating in the
undergraduate version of the class, taught by Norm White. The intended
audience for the class is MBA students who have an interest in
technology but are not necessarily programmers.
I have been thinking a lot about how to organize such a class, so that
it has some internal structure and flow. My current list of topics:
1. Search Engine Marketing: Introduction, Search Basics: Crawling,
Indexing, Ranking, PageRank, Spam, TrustRank
2. Search Engine Marketing: Analyzing and Understanding Users'
Behavior, Web Analytics
3. Search Engine Marketing: Search Engine Optimization
4. Search Engine Marketing: AdWords, AdSense, Click Fraud
5. Social Search and Collective Intelligence: Blog Analysis and
Aggregation, Network Analysis, Opinion Mining
6. Social Search and Collective Intelligence: Recommender Systems,
Reputation Systems
7. Social Search and Collective Intelligence: Prediction Markets
8. Social Search and Collective Intelligence: Wikis and Collaborative
Production
9. Ownership of Electronic Data: Privacy on the Web
10. Ownership of Electronic Data: Intellectual Property issues on the
Web
11. Ownership of Electronic Data: The Future of Privacy and
Intellectual Property
12. Future Directions and Wrapping-up
Some rough sketches of the assignments for this course:
* Run and optimize an online advertising campaign, using Google
AdWords or Microsoft adCenter.
* Analyze the visitor data of a website to assess the
effectiveness of different pages. You can use Google Analytics or
tools like CrazyEgg.
* Optimize the keyword campaign of a company by choosing the
appropriate keywords and bid amounts, depending on the competition
and the rank of the organic pages.
* Analyze (or build) a recommender system for movies, books, and TV
shows using Facebook data.
* Build a dating recommendation system using Facebook data.
* Build prediction markets at Inkling Markets, for an event of
interest, examine the accuracy of the predictions, and analyze the
behavior of the participants. Alternatively, analyze real-money
prediction markets at InTrade and BetFair and examine the effect
of real-life events in political campaigns.
* Use Google Trends to build a predictor of unemployment measures.
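To make the last assignment a bit more concrete, here is a minimal sketch of what such a predictor could look like: a one-variable least-squares fit of an unemployment measure on a search-interest index. Everything in it is an assumption for illustration. The function names `fit_line` and `predict` are hypothetical, the query is just an example, and the numbers are made-up placeholders, not real Google Trends or unemployment data.

```python
# Minimal sketch: fit y = intercept + slope * x by ordinary least squares.
# All numbers below are made-up placeholders, NOT real Google Trends
# or unemployment data.

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Predict the unemployment measure for a given search-interest value."""
    return intercept + slope * x

# Hypothetical weekly search-interest index for a query such as
# "unemployment benefits" (placeholder values).
search_index = [40, 45, 50, 55, 60, 65]
# Hypothetical unemployment measure for the same weeks (placeholder values).
unemployment = [300, 310, 320, 330, 340, 350]

intercept, slope = fit_line(search_index, unemployment)
forecast = predict(intercept, slope, 70)
print(f"intercept={intercept:.1f}, slope={slope:.2f}, forecast={forecast:.1f}")
```

A real assignment would replace the placeholder lists with an exported search-interest series and an official unemployment series, hold out the most recent weeks to test the forecast, and compare against a baseline that uses only past unemployment values.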
Any more topics that would be worth covering? Alternative exercises?
Posted by Panos Ipeirotis at 12:20 AM
Friday, November 9, 2007
Only for Database Geeks, the SeQueL
http://www.qwantz.com/archive/000153.html
Thanks to my students, Cissy and Shelley, for the pointer :-)
Posted by Panos Ipeirotis at 1:50 PM
Wednesday, November 7, 2007
What is Wrong with the ACM Typesetting Process?
Recently, I had to go through the process of preparing the
camera-ready version for two ACM TODS papers. I am not sure what
exactly the problem is, but the whole typesetting process at ACM seems
to be highly problematic.
My own pet peeves:
Pet peeve A: The copyeditors do not know how to typeset math, and they
do not even check the paper to see whether they have incorporated
their own edits correctly.
I detected problems repeatedly, and the copyeditor consistently failed
to check the proofs after making the edits. Here are a few examples.
Example #1
I submit the LaTeX sources and the PDF, with the following equation:
The copyeditor does not like the superscripted e^{\beta x_a} and
decides to convert it into the inline form exp(\beta x_a). Not a bad
idea! Look, though, at what I get back instead:
To make things worse, such errors were pervasive and appeared in many
equations in the paper. I asked the copyeditor to fix these errors and
send the paper back to me after the mistakes were fixed, so that I
could check it again. I was reassured that I would be able to inspect
the galley proofs again before they went to print. Well, why would I
expect that someone who makes such mistakes will be diligent enough to
let me inspect the paper again...
A couple of weeks later, and despite all the promises, I get an email
indicating that my paper was published and is available online. I
check the ACM Digital Library, and I see my paper online, with the
following formula:
OK, so we managed to get an interesting hybrid :-). Seriously, do the
ACM copyeditors even LOOK at what they are doing? If they do check and
they do not understand that this is an error, why do we even have
copyeditors?
Example #2
I assumed that the previous snafu was just an exception. Well, never
say never. A couple of days back, I got the galleys for another TODS
paper, due to be published in the next few days. Again, the copyeditor
decided to make (minor) changes in the equations. In my originally
submitted paper, I had the following equations:
In the galleys, the same equations look like this:
I will repeat myself: do the ACM copyeditors even LOOK at what they
are doing? If they do check and they do not understand that this is an
error, why do we even have copyeditors?
Pet peeve B: Converting vectorized figures into bitmaps
If you have submitted a paper to a conference, you know how crazy the
copyeditors get about getting PDFs with only Type 1 fonts, vectorized,
not-bitmapped, and so on. This is a good thing, as the resulting PDFs
contain only scalable, vector-based fonts that look nice both on
screen and on paper.
For the same reason, I also prepare nice, vectorized figures for my
papers, so that they look nice both on screen and on paper. However,
for some reason, the copyeditors at ACM seem to like converting the
vectorized images into horrible, ugly bitmaps that do not scale and
look awful. Here is an example of a figure in the original PDF:
Here is how the same figure looks in the PDF that I received as a
galley:
Am I too picky? Is it bad that I want my papers to look good?
End of pet peeves
(Note: The same copyediting process, described above, at IEEE seems to
work perfectly fine.)
I am starting to believe that the whole idea of publishing is a
horribly outdated process. I assumed that copyeditors were a part of a
chain that adds value to the paper, not a part that subtracts value.
If I need to check my paper carefully, being afraid that the
copyeditor will introduce bugs, that the copyeditor will make
everything look horrible, then why do we even have copyeditors? Just
get rid of them; they are simply parasites in the whole process! Can
you imagine having a professor who teaches a class and at the end the
students know less about the topic? Would you keep this professor
teaching?
Make everything open access. Let every author be responsible for the
way that the paper looks. Let the authors revise papers in digital
libraries that have problems. Why do we consider it perfectly
acceptable to have bug fixes and new versions for applications and
operating systems, but we want the papers that we produce to be frozen
in time and completely static?
Furthermore, the whole motivation for having journals is to have the
peer-reviewing process that guarantees that the "published" paper is
better than the submitted one. Everything else is secondary. Why keep
in the chain processes that only cause problems?
When are we going to realize that the publication system should be
completely revamped? Why not have an ongoing reviewing process,
improving the paper continuously? Should we keep the system as-is so
that we can be "objectively evaluated" by counting static papers that
are produced once and never visited again?