-
Website
http://avc.com/ -
Original page
http://www.avc.com/a_vc/2009/08/scanning-headlines.html -
The ideal aggregator should be a management tool. Instead it makes life feel more bloated with information, some useless some not....gah
I would kill for (i.e. pay for, and yes I mean that seriously) a very fine-tuned, socially involved aggregator + search which is additive. I feel like I am never going to get anywhere consistently, or even get a guesstimate of what I want, if the process is not additive in a much better sense, and if it doesn't guess over time... My aggregators can never tell me when I need to get another blog, or news headline, that will be important to what I am doing in my life. Or when it should be disrupted by, say, the United States sending a Senator to Myanmar... It's frustrating. In theory something like Twitter should cover all bases; in reality, this never works because it moves too fast, and you don't want to be attached to your tweets.
Further, something that has always shocked me about aggregators as they stand today: they don't all manage the push outward equally. With all of this social networking blah blah blah stuff, you would think that if you can add them into your aggregator, you would want to push out from your aggregator equally too. This is not the case... So you end up having to scan, then leave. Pain in the neck.
Total neutrality and comprehensiveness has been a day1 objective for my business model (we're a customizable super-aggregation platform). The user should be able to get a true 360 view on a given topic without bias, but with social rank data.
But to have such a fine-tuned aggregator, you need to start with well defined topics (as a starting point). Then a composite of your topics can be created for you. The aggregator itself would worry about new stuff and content, not you. We have a few such topics already here: http://portal.eqentia.com.
http://www.shanacarp.com/essays/aggregators-the...
Change aggregators from a generalist tool to a specific-use tool and they will be so much more productive, because computers just do not have the means to deal with our moods (unless they become moody). It will then become a choice of changing tools, not biting nails and working so hard to make a too-broad tool always work (it won't; let's move on).
I think we agree on the end-point, but the trick is how to get there: whether it's via pure learning from the user, or aided by a kick-start of sorts.
OTOH it seems that 'social aggregation' is faring much better than the fully automatic kind; most of what web communities do is 'social aggregation' of data.
bye
Andraz Tori, Zemanta
The more generic problems/questions to be solved about historical data are:
- what parts of the historical data are relevant? (this is probably one of the reasons APML hasn't been adopted so far)
- how should the system rank historical data considering the possible changes in user's interests over time?
- how should the system use historical data when the user is operating in "short term interest" mode? (looking for the last game results, comments, etc.)
bests,
./alex
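One possible way to make the questions above concrete is to weight each past reading event by its age, so that older interests fade. The sketch below is only an illustration under stated assumptions: the exponential half-life model, the function names, and the `(topic, day)` event format are all hypothetical and not something any commenter or product here describes.

```python
import math
from collections import defaultdict


def interest_scores(events, now, half_life_days=14.0):
    """Sum an exponentially decayed weight per topic.

    events: iterable of (topic, day) pairs, where day is a number of days
    on the same clock as `now` (hypothetical format, for illustration).
    """
    decay = math.log(2) / half_life_days
    scores = defaultdict(float)
    for topic, day in events:
        age = now - day
        # An event exactly one half-life old counts half as much as one today.
        scores[topic] += math.exp(-decay * age)
    return dict(scores)


def rank_topics(events, now, half_life_days=14.0):
    """Return topics ordered from highest to lowest decayed score."""
    scores = interest_scores(events, now, half_life_days)
    return sorted(scores, key=scores.get, reverse=True)


# Three old "startups" reads versus two very recent "sports" reads.
history = [("startups", 0), ("startups", 1), ("startups", 2),
           ("sports", 29), ("sports", 30)]
short_term = rank_topics(history, now=30, half_life_days=14)
long_term = rank_topics(history, now=30, half_life_days=365)
```

With a short half-life the two recent reads outrank the three older ones, which roughly corresponds to the "short term interest" mode; a long half-life favors the larger historical count instead. Choosing the half-life, or letting the user switch it, is exactly the open question raised above.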
It's human nature to browse around for no reason at all, which is what many people do, and it's no wonder that "(not so) smart" aggregation, basing suggestions only on the past, gets confused. Group aggregation is kind of effective, I guess (e.g. techmeme), but it's also often not directly relevant to your every day.
To me, the perfect aggregator collects news that is relevant to my context, and when you really get down to it, that means that Google search is about as smart an aggregator as is needed.
The best example of a successful social aggregator, in my view, is Hacker News, while the worst is Digg, which has become completely useless to me as it went mainstream. Not that I think there's something wrong with "going mainstream", on the contrary, but there's something about the *way* Digg did it that let all the good out and all the bad in, and now whenever you go to Digg the only things that come to the top are lolcats.
Which is why I thought Reddit open sourcing their platform was such a great idea -- I think what you need is a sort of "Ning for News", where there would be infinite potential for niche communities to come together around a smart technical platform for aggregation. But Reddit never really pulled it off, probably because they got acquired too soon. Then you could build "meta-aggregators" around specific verticals -- tech, politics, etc.
Perhaps today the best potential for a smart social aggregator might be Twitter, especially combined with Bit.ly and other services.
Agreed about hacker news. What a great service and community they've got there
Have you seen the Hourly Press? It's an aggregator based on an authoritative social filter that aims to circumvent the Digg problem. It gets its links from twitter. (I'm involved).
http://hourlypress.com/
The first instance is News about News.
http://newsaboutnews.hourlypress.com/
1. Our interests are normally very wide (f.e. I consume content related to startups, technology, software architecture, but also a bit of sport -- but not about all sports, politics, global state, etc.). Basically that means that vertical aggregators will just be able to provide "locally" optimized content (as in math local optimum), while generic aggregators will be facing an even bigger problem: filtering through a huge amount of data.
2. Another problem faced by aggregators is if and how they balance quality against timeliness. There's content that is timely, but there's also content for which the quality matters more. To make things even more complicated: a) for some of us timeliness counts more than quality while for others it is exactly the opposite; b) there isn't one content type or content vertical where you can say that it falls in one and only one of these types.
3. Last, but not least, the way we are consuming content is mood-driven. And even if there have been attempts to quantify the mood-based consumption patterns using historical data, there's no guarantee that it will work on a daily basis (simply put: even if for the last couple of weeks I've spent more time reading about startup funding, this doesn't mean that today my top priority wouldn't be tech topics).
4. Historically, we've been using a search-based content consumption model (and most of the time we've been worried not to lose "important" content). But due to the exponential growth in content creation, I think that we will have to move towards a recommendation-based content consumption model (there's a lot more to say about this and unfortunately I don't think it will fit a single comment).
And there are quite a few more problems that make me believe that there's no such thing as a perfect aggregator.
Anyways, I'm one of those that have always believed in what Dave Winer formulated so well: "the fundamental law of the Internet seems to be the more you send them away the more they come back" (and I'll continue to work and experiment for improving the online content consumption experience).
This is an interesting idea that I'd like to think more about. Would you mind giving a couple of such "task" examples?
In case such a task is: "find out the details of the FriendFeed acquisition" I think this will remain in the search field (or if it is a longer term "task" then alerts are probably the way to go).
./alex
I wrote about it in the case of Blip.TV, I think their aggregator solves one problem for their user base- brand management. However that aggregator assumes a high level of text and interaction. Its goal is to easily create communities for tv show creators where their consumers hang out. Not every aggregator needs to have that goal.
The headline scanning numbers are certainly interesting, but they are only part of the story. It's sort of like my admonition to people concerned with traffic to their blog. Quantity is only part of the story. You might only have 3 readers, but if their names are Bill Gates, Warren Buffett, and Barack Obama -- and they like what you have to say -- it doesn't matter. Same with site visitors. Raw numbers only tell part of the story. The revenue those readers help to generate is really the important thing in assessing the business.
I think there's an emerging battle brewing in aggregation and curation models.
http://www.techcrunch.com/2009/08/16/the-media-...
Regardless. I contend that most ... 70% of us??? ... walk by a coffeeshop or book store that carries the nytimes within 2 hours of seeing a tweet. I know I personally spend 6 hours a day staring at the nyt rack from the hard wooden chairs of the santa monica or spring st. Soho starbux.
Oddly, I never Ever see a tweet from the times that makes me walk five feet and spend five dollars. The papers don't have to create an aggregation network. They just need to learn how to use twitter.
How about a dynamic model where aggregators pay a share of revenue generated to original content generators? Too hard to implement? Just use DNS to send paypal daily to site owners. The challenge is identifying the originating owner of the content. Associate a unique time stamp and an encrypted ID watermark equivalent with any content.
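As a rough illustration of the revenue-share idea above, here is a minimal sketch that splits a pool of aggregator revenue among content owners in proportion to how often their items were read. Everything in it is an assumption made for the example: the function name, the 50% default share, and the pro-rata-by-reads rule are hypothetical, and the hard parts the comment names (identifying owners, actually sending payments) are left out.

```python
def split_revenue(pool_cents, reads_by_owner, share_percent=50):
    """Split a share of revenue among owners, pro rata by read count.

    pool_cents: total aggregator revenue for the period, in integer cents.
    reads_by_owner: mapping of owner id -> number of reads (hypothetical input).
    share_percent: portion of the pool paid out to owners (assumed value).
    """
    payable = pool_cents * share_percent // 100
    total_reads = sum(reads_by_owner.values())
    if total_reads == 0:
        return {}
    # Integer division keeps amounts in whole cents; any rounding
    # remainder simply stays with the aggregator in this sketch.
    return {owner: payable * reads // total_reads
            for owner, reads in reads_by_owner.items()}


payouts = split_revenue(10_000, {"site-a.example": 3, "site-b.example": 1})
```

Working in integer cents avoids floating-point rounding surprises; a real scheme would also need a rule for the leftover cents and for disputed ownership.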
I know Fred gets very frightened when I say this; the market is going to pay for community, pay to comment, pay for verification. We want to know we have the authentic. I know that here, at least for a very long time, it will be free. Content in general apparently is something created by a community of people. And if the overlords of the brand want to control it, they are going to have to make people pay to have their name hooked onto the brand, just like I have to pay for a pair of Bensimon sneakers (note: I do not own Bensimon sneakers, just think they are hot...)
Aggregators are part of that economy, in that they control the lens through which we see which brands come first, and how we rank them, especially as aggregators technically get better (I suppose they are going through the awkward growth spurt period now). Whether the majority of us realize it or not, I believe this is one of the top ranking VC blogs, because of the power of branding. Content in these comments and in the blog itself could be seen as more valuable than in the lower ranks...
Thanks for the counter view William.
I was trying to follow the numbers and assumed I wasn't following the terminology. Thanks Pascal.
Yeah, the trend is right, in reality he would probably lose more visits over time.
People overemphasize traffic, and expectations of what sites should be bringing in are really out of whack. It's not in line with what's proven to be true in content and platform businesses over the course of time.
I looked into this a while ago, and of all the people who visited the NY Times in that month, only about 20% had visited the front page even once during that month. For the Washington Post, that number was even lower -- at 14%.
http://blog.agrawals.org/2007/04/09/who-reads-t...
And those are two of the biggest global news brands. I suspect that the numbers are even worse today.
The NY Times had a piece on this a few years ago called "This Boring Headline Is Written For Google"
http://www.nytimes.com/2006/04/09/weekinreview/...
The optimal headline from an SEO perspective would be generic, loaded with keywords. Cuteness, wordplay, allusions don't go very far.
But it's just the reverse for social media. There, you want to write something that clicks with a human.
Matt Cutts may have some insight on this from the google sandbox, he said they're doing some pretty major overhauls to shift the big G close to real time.
Aggregators like Techmeme help me filter out the noise. Keep the noise off your site, and I will be your direct traffic any day.
If the front page of NYTimes.com linked to everything interesting on the web instead of just their own stories, they could play the same game. I understand the organizational reluctance to do that, but I wonder if they have any other choice.
This crystallized some thoughts I've had on how production and distribution of news are destined to split. I believe the news business will begin to resemble the ecosystem of the motion picture business.
If the news business is to survive, production and distribution must be decoupled. The strong national brands will likely be able to play on both sides, just as the major motion picture studios do, but even there the production and distribution divisions will need to be autonomous.
Meanwhile, we're going to see new business models in which newsgathering and production can be done efficiently and profitably, and distributors will figure out how to make money while providing the necessary economic incentive for the producers.
News producers who are opening APIs and experimenting with a variety of syndication business models are likely to lead the way. Those that aren't are sitting on the sidelines and hoping that when the innovators "crack the code" they'll still be able to jump in and join the party. By then, it will likely be too late.
More here:
http://www.praxicom.com/2009/08/production-and-...
-G.
A 'link' requires a click-through, and click-throughs are slow. It reminds me of Fred's post about streaming overtaking P2P file-sharing -- it's getting faster to stream. Similarly, 'no click delivery' of well-targeted content may compel users to pay for (full-length) content.
APIs are a faster and more measurable way to deliver full-length content to licensed aggregators. If you know anyone that pays for 'free' blog content via their Kindle, that is a statement that speed (no clicks) and context (e-reader delivery) matter more than freely-available links. If you consider the forthcoming number of 'end points for reading' (slew of e-readers, netbooks, iTablet, PC-less printers and smart phones), device-sensitive targeting (sans ads) is important for aggregation.
For skimmers, scanners, grazers and snackers (online), the link-based economy may be fine and worth the time that click-throughs require. But RSS/Digg/Techmeme requires two clicks from headline to full-length content, with pages littered with ads and additional page clicks.
Licensing by aggregators that pay content producers a direct share of value derived from READING (not click-throughs) strikes me as a brighter future, to omichels' point. It will also avoid a future where a so-called 'wadget' (puke-in-mouth) is used to monetize aggregation, as suggested at the bottom of Mishkin's post on paidcontent.
At the end of the day, of course we all want that perfect newspaper for ourselves, but how do we figure out who is reading at a given site? We all want to be direct, but we built the web in such a way that isn't by nature direct- it is indirect.
If they linked more broadly, I think it would only help.
I hope a few papers figure out how to run in the internet age before they all go bankrupt...
We micro-aggregate within our field (book publishing professionals) and slice/curate sub-sections of aggregations.
But at the top level, we select the most relevant/interesting stories from hundreds of feeds, and rewrite the heads to convey as much information as possible in the link line. In an information-crowded world, helping communicate bullet points of interest but saving you reading the whole story is quite valuable. The better we do that, the less our readers need or want to click-through for the full story. (Most people tend toward the opposite; a provocative/teaser head that's designed to elicit ctrs.)
Surely the future is not listed headlines RSS style, but something that clearly communicates what we should prioritize and why we should read in a format that lives beautifully across all of my devices. Currently, it seems that most solutions are focused purely on content.
People have varied and diverse interests, and trying to home in on what you want to read is all the more challenging since your capacity for content on some topics is a breadth play (news on random events around the world, maybe events in your city) or a content play (things that interest you, maybe formula one, soccer news on your club, cooking etc.). The holy grail of this is a bit like stumbleupon paired with your interest profile (from facebook/delicious or otherwise).
Do you want something/one/site to actually have you figured out? Does that change your reading behavior just to push the limits perhaps and slowly change your very approach too? Sometimes the most interesting things are way beyond the scope of your "normal" reading!
SR
Whilst we may think (through rose-tinted spectacles) that we 'read' the entire newspaper in fact what we typically did (or 'do' if still a patron of hard copy news, like myself) is only drill-down into a small percentage of articles. From a headline we assimilate very quickly what stimulates us and so what warrants our further time to read in more detail.
Only a given number of stories at a given time can truly engage us - depending on our frame of mind at the given time, attention-span available, external influences, motives, etc.
I'm happy to sacrifice some time to skim a given % of headlines knowing this is part of the process of identifying/filtering what's really of interest to me. If all my headlines were so finely tuned to me that I drilled down 100% of the time, I'd be a tad concerned that I was missing out on other news/articles that required me to decide on whether they interest me or not.
With 'perfect' headline tuning, there is also the risk of losing that greatest of delights - serendipity.
The more we aggregate, the more headlines we have to scan and decide upon, and all in the blink of an eye. If an app can (over time) 'learn' from what the reader is actually drilling-down into it can begin to fine-tune the results on any given topic - as we are trying to do with ensembli.
But, in my opinion, we must never lose sight of the fact we're talking about words, not data - the tuning of words/their nuances and the discovery of news/articles is something we should take delight in and not be frustrated by.
Interestingly, they already do "Headlines From Around the Web" with their Tech Update email. For example, today's update had links to Techcrunch, Engadget, and TUAW:
http://www.nytimes.com/indexes/2009/08/17/techn...
It's not quite what you're suggesting, but it's a step in the right direction.
I love that the Times has so many experiments out there and I don't think it gets enough credit for trying most of the time. But, I also fear the way many of these experiments are released - by tip-toeing around the main site - will negatively affect the results it sees and what it learns from each one.