Connections versus Outputs

19/06/2009

Warning! This post may contain own trumpet blowing!

Laura Dewis sent me a report the lovely OU Online Services people had prepared in collaboration with consultants (MarketSentinel). They were interested in examining the broader influence of various web sites and looking at sentiment mining. The idea from an official communications perspective being you can see how well regarded your institution is in different sectors, and maybe influence that perception.

But from my perspective the analysis they performed could be tweaked to provide a measure of an individual's influence or prominence in the online community of their particular topic. As you will know, I am interested in the concept of 'digital scholarship' and in trying to get the type of online activity many of us engage in recognised as scholarly work.

The problem has been a mismatch between what has traditionally been measured to indicate academic standing in a particular field and the type of activity that takes place online. In short, formal systems such as the RAE or the REF in the UK are focused on outputs. They make a nod towards impact, or reputation, but it's really outputs that they want - and we largely know what these look like in this world: refereed papers in respected journals, keynote speeches, winning large research bids, etc. These don't map well onto the digital world - is 1000 blog posts good? What about 1000 subscribers? Is having a large Twitter network a sign of standing? All too easily such measures fall down. So I have been looking for something more robust which might act as a metric for measuring an individual's reputation in their subject area.

The report the Comms team provided me gets some way towards this. They chose the subject area of 'distance learning' really as a test to see how it worked - one would expect the Open University to come out well in this. Here is the blurb from the company on how they determine influence:

"A stakeholder of a topic is “an entity (individual or organisation) who is sufficiently referenced in the context of the topic”.
When performing a MarketInfluence study our computer systems initially collect any documents on the Internet (web pages, Word, pdf or PowerPoint documents) which match a defined search phrase.
These documents (often hundreds of thousands) are then analysed. The analysis is especially focused on identifying who references whom.
Based on these references it is possible to calculate influence."
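
The company does not say how influence is calculated from these references, so the following is only a rough sketch of one familiar way to do it: a PageRank-style score in which being referenced by an influential entity counts for more than being referenced by an obscure one. The function name, the damping value and the sample data are all assumptions made for illustration, not MarketSentinel's method.

```python
from collections import defaultdict


def influence_scores(references, damping=0.85, iterations=50):
    """Score entities by who references them, PageRank-style.

    references: iterable of (source, target) pairs, "source references target".
    This is an illustrative stand-in, not the consultants' algorithm.
    """
    # Ignore self-references so nobody gains influence by citing themselves.
    edges = {(s, t) for s, t in references if s != t}
    nodes = sorted({n for edge in edges for n in edge})
    if not nodes:
        return {}

    outgoing = defaultdict(list)
    for s, t in sorted(edges):
        outgoing[s].append(t)

    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Entities that reference nobody share their score with everyone.
        dangling = sum(score[n] for n in nodes if not outgoing[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new = {n: base for n in nodes}
        for s in nodes:
            for t in outgoing[s]:
                new[t] += damping * score[s] / len(outgoing[s])
        score = new
    return score


# Hypothetical data: three blogs reference "ou", which references one of them.
sample = [
    ("blog_a", "ou"), ("blog_b", "ou"), ("blog_c", "ou"),
    ("ou", "blog_a"),
    ("blog_a", "blog_a"),  # self-reference, ignored
]
scores = influence_scores(sample)
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

In this toy data "blog_a" ends up above the other two blogs purely because the most referenced entity points to it, which is the same kind of effect that association with a large institution might have on an individual's ranking.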

Influencers

They then apply a number of filters to prevent self-citation. From this they drew up the following list of top 100 influencers in 'distance learning':

I come out 4th, Brian Kelly 6th and Tony 10th. I'm sure that Tony and I both gain some benefit from being associated with the OU, which drags up our distance learning status, but even so the mix of individuals alongside large professional organisations such as the Guardian, JISC and the BBC is interesting - the space has become democratised to an extent.

Betweeners
They then have a measure of friends in common, what they term 'betweenness':

"Stakeholders with high Betweenness are “stations” where information (on the issue in focus) is passed via in order to reach the constituency of said stakeholder. "
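
The description above resembles betweenness centrality from network analysis: how often a node sits on the shortest path between two other nodes. As a hedged sketch only, here is Brandes' algorithm for a small undirected, unweighted network. Whether the report uses this exact measure is not stated, and the node names are made up.

```python
from collections import defaultdict, deque


def betweenness(links):
    """Betweenness centrality (Brandes' algorithm), undirected and unweighted.

    links: iterable of (a, b) pairs. Illustrative only; the report's own
    definition of 'betweenness' may differ.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    nodes = sorted(neighbours)
    centrality = {n: 0.0 for n in nodes}

    for source in nodes:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {n: [] for n in nodes}
        paths = {n: 0 for n in nodes}
        distance = {n: -1 for n in nodes}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)

        # Work back from the farthest nodes, passing dependency upstream.
        dependency = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]

    # Every undirected pair has been counted once from each end.
    return {n: c / 2 for n, c in centrality.items()}


# Hypothetical chain: information from "a" must pass through b, c and d to reach "e".
chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
result = betweenness(chain)
print(result)
```

In the chain, the middle node carries every exchange between the two ends, so it scores highest, while the end points score zero.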

Here is the table for those with high betweenness:

So Brian, Tony, Alan, Grainne and I are all good conduits for information (in this narrow domain). Individuals seem to work better as betweeners than organisations.

Hubness
They reference Gladwell's Tipping Point and his notion of 'connectors', to suggest some sites/people have a high level of 'hubness': "Hubness is the characteristic of disproportionately linking to those who are authoritative on a given topic."
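
That definition is close to the 'hub' score in Kleinberg's HITS algorithm, where a good hub links to good authorities and a good authority is linked to by good hubs. The sketch below assumes that reading; the report does not name the algorithm, and the function name and sample links are invented for illustration.

```python
import math


def _normalise(values):
    # Scale so the squared scores sum to 1 (all zeros stay zero).
    norm = math.sqrt(sum(v * v for v in values.values()))
    return {n: (v / norm if norm else 0.0) for n, v in values.items()}


def hubs_and_authorities(links, iterations=50):
    """HITS-style hub and authority scores.

    links: iterable of (source, target) pairs, "source links to target".
    An assumed interpretation of 'hubness', not the report's stated method.
    """
    edges = sorted({(s, t) for s, t in links if s != t})
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    authority = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # An authority is something that good hubs link to.
        authority = {n: 0.0 for n in nodes}
        for s, t in edges:
            authority[t] += hub[s]
        # A hub is something that links to good authorities.
        new_hub = {n: 0.0 for n in nodes}
        for s, t in edges:
            new_hub[s] += authority[t]
        hub = _normalise(new_hub)
        authority = _normalise(authority)
    return hub, authority


# Hypothetical links: "curator" points at every well-cited site.
sample_links = [
    ("curator", "ou"), ("curator", "jisc"), ("curator", "bbc"),
    ("fan", "ou"),
    ("ou", "jisc"),
]
hub, authority = hubs_and_authorities(sample_links)
print(max(hub, key=hub.get))
```

A site that links out to nobody gets a hub score of zero however widely it is cited itself, which may be one reason organisations fare worse than individuals on a measure like this.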

Again, people tend to be better hubs than organisations. Oh, and from now on you MUST address me as 'your hubness'.

It is still problematic, and could be gamed. I don't know enough about the algorithms used to assess this. One would also need to be careful about the search term used - 'distance learning' is quite a niche, UK term I think (had it been 'educational technology' we would have had a very different list). But, all this isn't (just) about ego massage - it strikes me that if we could develop such an algorithm so that we could easily enter any subject domain, this would provide a useful tool for measuring an individual's online influence/reputation/status etc in their field. This would then provide evidence for justifying this type of work and in seeking promotion. It could offer us the Alt-REF I was after. At the moment this work belongs to the consultants, and we would want to tweak it for academic use, but it does suggest that such an analysis is possible. A JISC project to develop the service for all academics?

What this gets at is that online activity is different - it is less focused around outputs and more around overall activity and reputation. And it does begin to back up what I've always felt - that this stuff isn't just peripheral, playing around, but increasingly is significant to individuals and organisations. I wouldn't want to try, but one could think about it in monetary terms - how much is this influence worth if you were to try and buy it (through advertising or other means)?

Posted at 02:16 PM in digital scholarship

Comments

Martin,

Great post! This is so very important and I only wish the higher education sector would open its eyes to what 'impact' and 'engagement' are actually about. Connections are the means by which outputs are even possible!

I can only talk about the approach taken by business schools as my frame of reference. To give you an idea of how far behind business schools are in measuring impact and engagement, even hardcopy textbooks are poorly regarded in business - unless you are the market leader and in your 15th edition - despite the fact that they reach thousands of students every day with their words.

The RAE biases business schools to focus only on outputs of scholarly academic articles that are read by the elite few within our own sector. The pressure is especially apparent at junior levels of the academic community.

But what about impact and engagement with business, and the wider social community that business is embedded within? I don't just want my students or peers to read/comment on my work, I want people who do business every day to read and comment too!

So as far as encouraging, supporting and measuring the use of digital technologies - Twitter, blog posts, podcasts etc - as a means to engage, and to share what we research and its impact with wider audiences beyond students and other academics - it is just not on the agenda.

But it should be!

Thanks for your post. As for your concerns over the methodology by which the above was generated, I for one have more confidence in the above statistics than I do in the rigour of the academic peer review process and the means by which journals are regarded as top tier (5*). Oh, and the RAE!

Thanks!

P.S. Keep blogging!

Posted by: Dr. Kelly Page from Cardiff Business School. | 19/06/2009 at 02:49 PM

The methodology looks very odd to me... a Google search for '"distance learning" site:http://ukwebfocus.wordpress.com/' returns 3 hits and neither that phrase nor 'learning' appears in Brian's blog's tag cloud.

I understand that somebody can be well connected without actually writing on the topic at hand - but does that imply 'impact'? What value is there in simply measuring 'connectedness'? Not a lot I would say. Just look at Twitter - connectedness is cheap.

Further, it seems to me that the point of contention (or at least, the significant point of influence over the results) here is in the initial selection of the resources being analysed. No info is given about how this is done.

Sorry... not meaning to pick on Brian here but the appearance of his blog, given this particular choice of topic, stuck out a little.

To pick on a different target... is Moodle really the second most significant influencer in the sphere of distance learning?

Perhaps I'm missing the point that you are trying to make here. I presume we want a measure of 'impact' to be about more than simply a popularity contest? Activity and popularity != reputation and impact. Life isn't that simple and measures like this tend to reduce measurements to the simplistic imho.

The bottom line is, REF might be bad but we've got a long long way to go to beat it in the academic space and this certainly doesn't come close - no?

Posted by: Andy Powell | 19/06/2009 at 05:30 PM

@Andy - because I don't have access to their methodology, I can't really defend it, but I think you may be being overly harsh. To come back on some of your points:
They are measuring the 'influence' of web sites, so I think we could argue that connectedness does matter. As I understand it they are measuring the semantic links, so the relevance of that site to the links provided. I'm guessing Brian crops up because lots of people who identify themselves with distance learning link to him.
As for Moodle, the forums and the product itself are arguably major influencers in distance ed.
I'm not sure to what extent they pre-selected the sites to analyse or whether these are the ones that arise. Their initial spiel seems to suggest the latter, but I wonder if it isn't a combination of the two.
My contention was that we could start to build on things like this. I think they begin to go beyond just popularity, but we could extend that more. Certainly if you were an academic making a case for promotion that you had an online reputation in your subject, say 'Climate change', then I think you would expect to be appearing on a list like this if you want to back up that claim.
I think the REF is as subject to your criticisms as this approach, but my guess is we would never really rely solely on an automatic ranking - it would have to be mixed with peer review as well. But the point I wanted to make was that we can point at significant outputs for the REF, but online is more ephemeral - it may not be one particular post that is significant, but the cumulative effect. And I feel it would be useful to find means of measuring this reliably across subject domains.

Posted by: Martin | 19/06/2009 at 05:48 PM

In this case, knowing the methodology or not is almost neither here nor there because the results so clearly speak for themselves. Someone (in this case Brian) who has only written three blog posts which even mention the phrase "distance learning" (one of which is a guest post and two of which are where that phrase appears only in a comment by someone else) is flagged as being the 6th highest 'influencer' in the field of 'distance learning'. I'm sorry... this is nonsense, pure and simple!

Brian certainly is an influencer... but not in this particular field.

What Brian is, is very well connected to other people in the sphere of ICT. On the basis of this methodology he would probably turn out to be a high-ranking 'influencer' in almost any ICT-related subject you care to mention. Does that really tell us anything useful?

As I said, being well connected to people that write on a particular topic does not necessarily make you an influencer in that area - because your connections may be based on something completely different.

I don't see how there has been any attempt to measure 'semantic links' in this case - the evidence speaks for itself.

The reason I feel strongly about this is that I totally agree with your final statement: "But the point I wanted to make was that we can point at significant outputs for the REF, but online is more ephemeral - it may not be one particular post that is significant, but the cumulative effect. And I feel it would be useful to find means of measuring this reliably across subject domains".

The danger is that if you promote this kind of superficial bean-counting as a potential means of reliably measuring cumulative effect across domains then you do more harm than good - because it is so obviously broken.

Influence (or impact or whatever one chooses to call it) is a combination of having something to say on a topic coupled with the ability to get that message to a particular audience.

I don't think the size of the audience (the level of connectedness) necessarily determines the level of influence - an influential person might only need to get their message to one other person (if that person happens to be the VC of a university, or the prime minister, or whatever). And, clearly, what is being said determines whether the influence/impact is good or bad - the doctor behind the MMR scare in the UK was presumably well connected for example. Clearly, this is highly subjective, which is why peer-review remains so important.

Now... I accept that the level of someone's connectedness is indicative (in some way) of both their ability to get their message out to people, and people's 'trust' in that message. But, and this is a very big 'but', I think we are a long way from being able to measure any of this in any sensible/meaningful way.

More importantly, I think the results you show above are better seen as an example of the fact we can't measure this stuff easily, than an indicator that we can do something useful.

Sorry.

Posted by: Andy Powell | 20/06/2009 at 07:46 AM

A couple of points - it can't just be popularity based can it, as the BBC would clearly win each time. It must be about the nature of the links and citations (the semantic bit). So although you said 'connecting is easy' it's not just any connections that count (just as we accept that not all outputs are easy).
I have some suspicions about things that are happening in this algorithm:
1) A black hole effect - The OU is the main attractor, so association with it drags others up (I benefit a lot from this, if I was a prof at another uni mine wouldn't be as high I'll wager)
2) An echo chamber effect - me, Brian, Tony, Grainne etc link a lot to each other, so maybe there is a positive reinforcement effect.
3) The term is an odd one (it's not 'online education' for example) so maybe it throws up some odd results.
BUT - two things:
1) Maybe when we get an unexpected result it is telling us something about the audience of that site and we should explore it rather than dismiss it.
2) I was only ever arguing that we could use such things as a basis to build out from (and deal with issues such as the three I've raised), and make it more HE focused. Your conclusion seems to be 'this one isn't perfect, so we'll abandon the whole exercise and revert to the REF'. There's more we can do than that isn't there, surely?

Posted by: Martin | 20/06/2009 at 08:02 AM

Re: "abandon the whole exercise and revert to the REF" ...

No, not quite! :-) Just "treat with caution and a very heavy dollop of cynicism"!

Re your second 2) above... I think we want to get to the same place (some sensible measure of scholarly impact on the social Web) but I disagree with you that this is a helpful basis on which to build.

I'd start by stepping back and asking:

- what do we want to measure?

- what can we measure?

Can we bring these things close enough together to create a useful measure of scholarly impact?

Posted by: AndyP | 20/06/2009 at 08:30 AM

Re: "it can't just be popularity based can it, as the BBC would clearly win each time"?

Well no... I presume there's some bi-directional measure of both inbound and outbound connectedness? (I'm guessing obviously). The BBC is rather uni-directional?

I'm also not clear if you mean "win each time" in the context of "distance learning" or "win each time" more generally. If the latter, no, I wouldn't.

Posted by: AndyP | 20/06/2009 at 08:43 AM

" I think we want to get to the same place (some sensible measure of scholarly impact on the social Web) but I disagree with you that this is a helpful basis on which to build."

Yes, I think you're right. And I accept that there may be some ego in this - I think this has some validity because I came out well in it. But even so, the three problems I list in my next post have come to me by considering this, and so any metric we developed would need to avoid these. This wouldn't have occurred to me if we hadn't this to build on. So, in this sense at least, it's useful.

And, I do genuinely appreciate your comments here (even if we are disagreeing) - it has really helped stretch my thinking on this by being challenged, so thanks.

Posted by: Martin | 20/06/2009 at 09:45 AM

Re: Andy: "Clearly, this (digital connectedness) is highly subjective, which is why peer-review remains so important."

Hey both! Great discussion. I agree (and disagree) with points on both sides and increasingly feel that in academia we need a combination of methodologies to ascertain not just the number of outputs but also the value of the work or person, be it the impact and contribution of scholarly work in offline or digital networks.

In response to the statement by Andy above, the peer-review system upon which the RAE (future REF) is based is highly, highly subjective. The very way we rank journals places the onus on the history, popularity and reach of a journal rather than on its contribution to academic knowledge and its wider impact on society. It also discriminates against cross-discipline work - if you are employed by a media department and publish in an IT journal, chances are it will not work in your favour!

Further, how one comes to be included amongst those pages of top-tier journals is also highly subjective. It's a game we play as we know it is the means by which our performance is evaluated. But this doesn't make it objective nor the best method by which to evaluate the 'impact' of outputs or the reach of that impact.

Did you know that under the current system some leading scholars would not have been recognised for their contribution or impact? For example, Charles Darwin would not have been included in the RAE/REF, as he took over 5 years to collect his data and nearly 22 years to write and publish his theory of evolution. The RAE cycle is 7-8 years, in which a minimum of 4 top outputs is stressed!

In contrast, Einstein was a scholar who wrote hundreds of books and articles, yet his first writings in 1905 (today regarded as tremendous achievements) were shunned by the academic physics community of the time. He wasn't connected to the academic community but worked outside it!

It was the work of astronomers and the media that gave his 1911 theories on general relativity international acclaim in 1919. Einstein developed his theories while working for the Swiss patent office (1903-1911), not in academia! He was somewhat shunned by the physics community, esp. following the international media acclaim for his work in 1911.

So my point ... in HE we have developed an evaluation system that constrains academic voice and creative intellectual freedom as we are focused on outputs NOT connections. We need both!

What is also required is a combination of measures: measurement of outputs for scholarly impact on the academic community, coupled with measures of connectedness/impact in academic networks and, increasingly, in wider social networks.

In this I agree with Martin that this is a good base from which to start the dialogue on how one measures digital connectedness to encourage academic voice in a digital space. But I also agree with you, Andy, that quality (the semantic context of those connections) is very important! How we do this is a matter for another discussion.

Like with Twitter, it is not the number of followers or the number of updates, but the quality and value of those connections that is important!

If only the RAE (REF) thought the same!

Smiles
Kelly

Posted by: Kelly Page | 22/06/2009 at 02:56 PM

So what you are after is a kind of virtual thesaurus where the words are the people and the synonyms and antonyms are the areas of interest/influence, or lack of?

Posted by: Peter Lythgoe | 22/06/2009 at 04:26 PM

Really interesting conversation - liking the thread of comments.

I'm slightly suspicious about the original system and its so-called "semantic intelligence". Not because it can't be done - in fact, the opposite - it just looks like it hasn't been done that well.

I reckon you could get the kinds of results above simply by doing some simple Google API pagerank type munging. But I reckon if you extended it so that you had some kind of intelligent Calais-esque calls, some language parsing and a bit of knowledge about the way that sites like LinkedIn worked - then I think you'd avoid quite a lot of the flaws that Andy has outlined above.

...which is not to say that a human element isn't the easiest / best way of doing it - just that the technical bit shouldn't be *that* hard.

Not that I'm offering to build it, btw :-)

Posted by: Mike Ellis | 22/06/2009 at 04:39 PM

You Tube if you want to?

Interesting stuff as always Martin although I feel a but coming on ...

I have no doubt there will be those who go for the new network dynamics of hubs and influence, but hasn't this always been so, with academics going the route of the public intellectual using the available media? In the past it was McLuhan or Foucault on TV or any number of others. Now it is academics on blogs, e.g. danah boyd, who was a well-known public intellectual even before completing her PhD.

What of the bulk of other academics interested in research and publication? They may move to open access routes for publication but will they want to spend their time in developing a public persona? I am not sure they or I will. I haven't yet developed a blog, though I follow others, including yours. I am not sure that digital scholarship covers all or even the main aspects of intellectual endeavour. Sometimes it is a lone academic quarrying away obscurely on a narrow point that makes a difference. Some of the dynamics of intellectual life require a position outside of the public gaze.

Secondly, hubs need audiences to be hubs. A hub is a power (law) relationship in which a small number of nodes have many links while most have few. In short, we can't all be hubs. For hubs to exist there has to be a more passive audience, and the dynamic of networks reinforces this.

So a note of caution I guess. I and many others, while interested, are not likely to rush in to become 'digital scholars' in anything like the full sense, even if we do provide an audience for those who will.

[First attempt to post this was Friday 19th so sorry if it is out of sequence with other comments]

Posted by: Chris Jones | 23/06/2009 at 02:01 PM
