7. Scoring

Other people use kill files, but we here at Gnus Towers like scoring better than killing, so we'd rather switch than fight. They do something completely different as well, so sit up straight and pay attention!

All articles have a default score (gnus-summary-default-score), which is 0 by default. This score may be raised or lowered either interactively or by score files. Articles that have a score lower than gnus-summary-mark-below are marked as read.

Gnus will read any score files that apply to the current group before generating the summary buffer.

There are several commands in the summary buffer that insert score entries based on the current article. You can, for instance, ask Gnus to lower or increase the score of all articles with a certain subject.

There are two sorts of scoring entries: Permanent and temporary. Temporary score entries are self-expiring entries. Any entries that are temporary and have not been used for, say, a week, will be removed silently to help keep the sizes of the score files down.

7.1 Summary Score Commands  Adding score entries for the current group.
7.2 Group Score Commands  General score commands.
7.3 Score Variables  Customize your scoring. (My, what terminology).
7.4 Score File Format  What a score file may contain.
7.5 Score File Editing  You can edit score files by hand as well.
7.6 Adaptive Scoring  Big Sister Gnus knows what you read.
7.7 Home Score File  How to say where new score entries are to go.
7.8 Followups To Yourself  Having Gnus notice when people answer you.
7.9 Scoring On Other Headers  Scoring on non-standard headers.
7.10 Scoring Tips  How to score effectively.
7.11 Reverse Scoring  That problem child of old is no longer a problem.
7.12 Global Score Files  Earth-spanning, ear-splitting score files.
7.13 Kill Files  They are still here, but they can be ignored.
7.14 Converting Kill Files  Translating kill files to score files.
7.15 GroupLens  Getting predictions on what you like to read.
7.16 Advanced Scoring  Using logical expressions to build score rules.
7.17 Score Decays  It can be useful to let scores wither away.


7.1 Summary Score Commands

The score commands that alter score entries do not actually modify real score files. That would be too inefficient. Gnus maintains a cache of previously loaded score files, one of which is considered the current score file alist. The score commands simply insert entries into this list, and upon group exit, this list is saved.

The current score file is by default the group's local score file, even if no such score file actually exists. To insert score entries into some other score file (e.g. `all.SCORE'), you must first make this score file the current one.

General score commands that don't actually change the score file:

V s
Set the score of the current article (gnus-summary-set-score).

V S
Display the score of the current article (gnus-summary-current-score).

V t
Display all score rules that have been used on the current article (gnus-score-find-trace). In the *Score Trace* buffer, you may type e to edit the score file corresponding to the score rule on the current line, and f to format (gnus-score-pretty-print) the score file and edit it.

V w
List words used in scoring (gnus-score-find-favourite-words).

V R
Run the current summary through the scoring process (gnus-summary-rescore). This might be useful if you're playing around with your score files behind Gnus' back and want to see the effect you're having.

V c
Make a different score file the current one (gnus-score-change-score-file).

V e
Edit the current score file (gnus-score-edit-current-scores). You will be popped into a gnus-score-mode buffer (see section 7.5 Score File Editing).

V f
Edit a score file and make this score file the current one (gnus-score-edit-file).

V F
Flush the score cache (gnus-score-flush-cache). This is useful after editing score files.

V C
Customize a score file in a visually pleasing manner (gnus-score-customize).

The rest of these commands modify the local score file.

V m
Prompt for a score, and mark all articles with a score below this as read (gnus-score-set-mark-below).

V x
Prompt for a score, and add a score rule to the current score file to expunge all articles below this score (gnus-score-set-expunge-below).

The keystrokes for actually making score entries follow a very regular pattern, so there's no need to list all the commands. (Hundreds of them.)

  1. The first key is either I (upper case i) for increasing the score or L for lowering the score.
  2. The second key says what header you want to score on. The following keys are available:
    a
    Score on the author name.

    s
    Score on the subject line.

    x
    Score on the Xref line--i.e., the cross-posting line.

    r
    Score on the References line.

    d
    Score on the date.

    l
    Score on the number of lines.

    i
    Score on the Message-ID header.

    e
    Score on an "extra" header, that is, one of those in gnus-extra-headers, if your NNTP server tracks additional header data in overviews.

    f
    Score on followups--this matches the author name, and adds scores to the followups to this author. (Using this key leads to the creation of `ADAPT' files.)

    b
    Score on the body.

    h
    Score on the head.

    t
    Score on thread. (Using this key leads to the creation of `ADAPT' files.)

  3. The third key is the match type. Which match types are valid depends on what headers you are scoring on.

    strings

    e
    Exact matching.

    s
    Substring matching.

    f
    Fuzzy matching (see section 8.18 Fuzzy Matching).

    r
    Regexp matching.

    date
    b
    Before date.

    a
    After date.

    n
    This date.

    number
    <
    Less than number.

    =
    Equal to number.

    >
    Greater than number.

  4. The fourth and usually final key says whether this is a temporary (i.e., expiring) score entry, or a permanent (i.e., non-expiring) score entry, or whether it is to be done immediately, without adding to the score file.
    t
    Temporary score entry.

    p
    Permanent score entry.

    i
    Immediate scoring (applied at once, without adding to the score file).

  5. If you are scoring on `e' (extra) headers, you will then be prompted for the header name on which you wish to score. This must be a header named in gnus-extra-headers, and `TAB' completion is available.

So, let's say you want to increase the score on the current author with exact matching permanently: I a e p. If you want to lower the score based on the subject line, using substring matching, and make a temporary score entry: L s s t. Pretty easy.
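
As a rough sketch of what these two key sequences amount to, the entries below follow the score file format described in section 7.4 Score File Format. The author and subject strings are made up, 728373 merely stands in for the current day number, and the exact form Gnus writes (for instance, whether it stores the default score explicitly) is an assumption here:

```elisp
;; Hypothetical entries, equivalent in effect to the two commands above.
(("from"
  ;; I a e p: raise, exact match, permanent (no date element)
  ("Some Author <author@example.invalid>" 1000 nil e))
 ("subject"
  ;; L s s t: lower, substring match, temporary (has a date element)
  ("some annoying subject" -1000 728373 s)))
```

The amount 1000 is the default value of gnus-score-interactive-default-score; a numerical prefix would replace it.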

To make things a bit more complicated, there are shortcuts. If you use a capital letter on either the second or third keys, Gnus will use defaults for the remaining one or two keystrokes. The defaults are "substring" and "temporary". So I A is the same as I a s t, and I a R is the same as I a r t.

These functions take both the numerical prefix and the symbolic prefix (see section 8.3 Symbolic Prefixes). A numerical prefix says how much to lower (or increase) the score of the article. A symbolic prefix of a says to use the `all.SCORE' file for the command instead of the current score file.

The gnus-score-mimic-keymap says whether these commands will pretend they are keymaps or not.


7.2 Group Score Commands

There aren't many of these as yet, I'm afraid.

W f
Gnus maintains a cache of score alists to avoid having to reload them all the time. This command will flush the cache (gnus-score-flush-cache).

You can do scoring from the command line by saying something like:

 
$ emacs -batch -l ~/.emacs -l ~/.gnus.el -f gnus-batch-score


7.3 Score Variables

gnus-use-scoring
If nil, Gnus will not check for score files, and will not, in general, do any score-related work. This is t by default.

gnus-kill-killed
If this variable is nil, Gnus will never apply score files to articles that have already been through the kill process. While this may save you lots of time, it also means that if you apply a kill file to a group, and then change the kill file and want to run it over your group again to kill more articles, it won't work. You have to set this variable to t to do that. (It is t by default.)

gnus-kill-files-directory
All kill and score files will be stored in this directory, which is initialized from the SAVEDIR environment variable by default. This is `~/News/' by default.

gnus-score-file-suffix
Suffix to add to the group name to arrive at the score file name (`SCORE' by default.)

gnus-score-uncacheable-files
All score files are normally cached to avoid excessive re-loading of score files. However, this might make your Emacs grow big and bloated, so this regexp can be used to weed out score files unlikely to be needed again. It would be a bad idea to deny caching of `all.SCORE', while it might be a good idea to not cache `comp.infosystems.www.authoring.misc.ADAPT'. In fact, this variable is `ADAPT$' by default, so no adaptive score files will be cached.

gnus-save-score
If you have really complicated score files, and do lots of batch scoring, then you might set this variable to t. This will make Gnus save the scores into the `.newsrc.eld' file.

If you do not set this to t, then manual scores (like those set with V s (gnus-summary-set-score)) will not be preserved across group visits.

gnus-score-interactive-default-score
Score used by all the interactive raise/lower commands to raise/lower score with. Default is 1000, which may seem excessive, but this is to ensure that the adaptive scoring scheme gets enough room to play with. We don't want the small changes from the adaptive scoring to overwrite manually entered data.

gnus-summary-default-score
Default score of an article, which is 0 by default.

gnus-summary-expunge-below
Don't display the summary lines of articles that have scores lower than this variable. This is nil by default, which means that no articles will be hidden. This variable is local to the summary buffers, and has to be set from gnus-summary-mode-hook.

gnus-score-over-mark
Mark (in the third column) used for articles with a score over the default. Default is `+'.

gnus-score-below-mark
Mark (in the third column) used for articles with a score below the default. Default is `-'.

gnus-score-find-score-files-function
Function used to find score files for the current group. This function is called with the name of the group as the argument.

Predefined functions available are:

gnus-score-find-single
Only apply the group's own score file.

gnus-score-find-bnews
Apply all score files that match, using bnews syntax. This is the default. If the current group is `gnu.emacs.gnus', for instance, `all.emacs.all.SCORE', `not.alt.all.SCORE' and `gnu.all.SCORE' would all apply. In short, the instances of `all' in the score file names are translated into `.*', and then a regexp match is done.

This means that if you have some score entries that you want to apply to all groups, then you put those entries in the `all.SCORE' file.

The score files are applied in a semi-random order, although Gnus will try to apply the more general score files before the more specific score files. It does this by looking at the number of elements in the score file names--discarding the `all' elements.

gnus-score-find-hierarchical
Apply all score files from all the parent groups. This means that you can't have score files like `all.SCORE', but you can have `SCORE', `comp.SCORE' and `comp.emacs.SCORE' for each server.

This variable can also be a list of functions. In that case, all these functions will be called with the group name as argument, and all the returned lists of score files will be applied. These functions can also return lists of lists of score alists directly. In that case, the functions that return these non-file score alists should probably be placed before the "real" score file functions, to ensure that the last score file returned is the local score file. Phu.

For example, to do hierarchical scoring but use a non-server-specific overall score file, you could use the value

 
(list (lambda (group) (list "all.SCORE"))
      'gnus-score-find-hierarchical)

gnus-score-expiry-days
This variable says how many days should pass before an unused score file entry is expired. If this variable is nil, no score file entries are expired. It's 7 by default.

gnus-update-score-entry-dates
If this variable is non-nil, temporary score entries that have been triggered (matched) will have their dates updated. (This is how Gnus controls expiry--all non-matched-entries will become too old while matched entries will stay fresh and young.) However, if you set this variable to nil, even matched entries will grow old and will have to face that oh-so grim reaper.

gnus-score-after-write-file-function
Function called with the name of the score file just written.

gnus-score-thread-simplify
If this variable is non-nil, article subjects will be simplified for subject scoring purposes in the same manner as with threading--according to the current value of gnus-simplify-subject-functions. If the scoring entry uses substring or exact matching, the match will also be simplified in this manner.
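
A few of these variables might be set along the following lines in `~/.gnus.el'. This is only a sketch: the values are arbitrary, and setting gnus-summary-expunge-below through make-local-variable in the hook is just one cautious way to respect the buffer-local requirement mentioned above.

```elisp
;; Illustrative values only; adjust to taste.
(setq gnus-use-scoring t
      ;; Expire unused temporary entries after two weeks instead of one.
      gnus-score-expiry-days 14
      ;; Only apply the group's own score file.
      gnus-score-find-score-files-function 'gnus-score-find-single)

;; gnus-summary-expunge-below is local to summary buffers, so it is set
;; from gnus-summary-mode-hook.  The threshold -500 is an arbitrary example.
(add-hook 'gnus-summary-mode-hook
          (lambda ()
            (set (make-local-variable 'gnus-summary-expunge-below) -500)))
```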


7.4 Score File Format

A score file is an emacs-lisp file that normally contains just a single form. Casual users are not expected to edit these files; everything can be changed from the summary buffer.

Anyway, if you'd like to dig into it yourself, here's an example:

 
(("from"
  ("Lars Ingebrigtsen" -10000)
  ("Per Abrahamsen")
  ("larsi\\|lmi" -50000 nil R))
 ("subject"
  ("Ding is Badd" nil 728373))
 ("xref"
  ("alt.politics" -1000 728372 s))
 ("lines"
  (2 -100 nil <))
 (mark 0)
 (expunge -1000)
 (mark-and-expunge -10)
 (read-only nil)
 (orphan -10)
 (adapt t)
 (files "/hom/larsi/News/gnu.SCORE")
 (exclude-files "all.SCORE")
 (local (gnus-newsgroup-auto-expire t)
        (gnus-summary-make-false-root empty))
 (eval (ding)))

This example demonstrates most score file elements. See section 7.16 Advanced Scoring, for a different approach.

Even though this looks much like Lisp code, nothing here is actually evaled. The Lisp reader is used to read this form, though, so it has to be valid syntactically, if not semantically.
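
Since a score file is read rather than evaluated, one way to check that a hand-edited file is at least syntactically valid is to run it through the Lisp reader yourself. The following is a sketch; the function name and the file name in the usage comment are hypothetical:

```elisp
(defun my-read-score-file (file)
  "Read the first form in score FILE and return it without evaluating it.
Signals an error if the form is not syntactically valid."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (read (current-buffer))))

;; Hypothetical usage: look up the "from" entries of a score file.
;; (assoc "from" (my-read-score-file "~/News/gnu.emacs.gnus.SCORE"))
```

This only tells you that the form can be read; it says nothing about whether Gnus will accept the entries semantically.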

The following keys are supported by this alist:

STRING
If the key is a string, it is the name of the header to perform the match on. Scoring can only be performed on these eight headers: From, Subject, References, Message-ID, Xref, Lines, Chars and Date. In addition to these headers, there are three strings to tell Gnus to fetch the entire article and do the match on larger parts of the article: Body will perform the match on the body of the article, Head will perform the match on the head of the article, and All will perform the match on the entire article. Note that using any of these last three keys will slow down group entry considerably. The final "header" you can score on is Followup. These score entries will result in new score entries being added for all follow-ups to articles that match these score entries.

Following this key is an arbitrary number of score entries, where each score entry has one to four elements.

  1. The first element is the match element. On most headers this will be a string, but on the Lines and Chars headers, this must be an integer.

  2. If the second element is present, it should be a number--the score element. This number should be an integer in the neginf to posinf interval. This number is added to the score of the article if the match is successful. If this element is not present, the gnus-score-interactive-default-score number will be used instead. This is 1000 by default.

  3. If the third element is present, it should be a number--the date element. This date says when this score entry last matched, which provides a mechanism for expiring the score entries. If this element is not present, the score entry is permanent. The date is represented by the number of days since December 31, 1 BCE.

  4. If the fourth element is present, it should be a symbol--the type element. This element specifies what function should be used to see whether this score entry matches the article. Which match types can be used depends on what header you wish to perform the match on.
    From, Subject, References, Xref, Message-ID
    For most header types, there are the r and R (regexp), as well as s and S (substring) types, and e and E (exact match), and w (word match) types. If this element is not present, Gnus will assume that substring matching should be used. R, S, and E differ from the others in that the matches will be done in a case-sensitive manner. All these one-letter types are really just abbreviations for the regexp, string, exact, and word types, which you can use instead, if you feel like it.

    Extra
    Just as for the standard string overview headers, if you are using gnus-extra-headers, you can score on these headers' values. In this case, there is a 5th element in the score entry, being the name of the header to be scored. The following entry is useful in your `all.SCORE' file in case of spam attacks from a single origin host, if your NNTP server tracks `NNTP-Posting-Host' in overviews:

     
    ("111.222.333.444" -1000 nil s
     "NNTP-Posting-Host")
    

    Lines, Chars
    These two headers use different match types: <, >, =, >= and <=.

    These predicates are true if

     
    (PREDICATE HEADER MATCH)
    

    evaluates to non-nil. For instance, the advanced match ("lines" 4 <) (see section 7.16 Advanced Scoring) will result in the following form:

     
    (< header-value 4)
    

    Or to put it another way: When using < on Lines with 4 as the match, we get the score added if the article has fewer than 4 lines. (It's easy to get confused and think it's the other way around. But it's not. I think.)

    When matching on Lines, be careful because some back ends (like nndir) do not generate a Lines header, so every article ends up being marked as having 0 lines. This can lead to strange results if you happen to lower the score of articles with few lines.

    Date
    For the Date header we have three kinda silly match types: before, at and after. I can't really imagine this ever being useful, but, like, it would feel kinda silly not to provide this function. Just in case. You never know. Better safe than sorry. Once burnt, twice shy. Don't judge a book by its cover. Never not have sex on a first date. (I have been told that at least one person, and I quote, "found this function indispensable", however.)

    A more useful match type is regexp. With it, you can match the date string using a regular expression. The date is normalized to ISO8601 compact format first--YYYYMMDDTHHMMSS. If you want to match all articles that have been posted on April 1st in every year, you could use `....0401.........' as a match string, for instance. (Note that the date is kept in its original time zone, so this will match articles that were posted when it was April 1st where the article was posted from. Time zones are such wholesome fun for the whole family, eh?)

    Head, Body, All
    These three match keys use the same match types as the From (etc) header uses.

    Followup
    This match key is somewhat special, in that it will match the From header, and affect the score of not only the matching articles, but also all followups to the matching articles. This allows you to, e.g., increase the score of followups to your own articles, or decrease the score of followups to the articles of some known trouble-maker. Uses the same match types as the From header uses. (Using this match key will lead to creation of `ADAPT' files.)

    Thread
    This match key works along the same lines as the Followup match key. If you say that you want to score on a (sub-)thread started by an article with a Message-ID x, then you add a `thread' match. This will add a new `thread' match for each article that has x in its References header. (These new `thread' matches will use the Message-IDs of these matching articles.) This will ensure that you can raise/lower the score of an entire thread, even though some articles in the thread may not have complete References headers. Note that using this may lead to nondeterministic scores of the articles in the thread. (Using this match key will lead to creation of `ADAPT' files.)

mark
The value of this entry should be a number. Any articles with a score lower than this number will be marked as read.

expunge
The value of this entry should be a number. Any articles with a score lower than this number will be removed from the summary buffer.

mark-and-expunge
The value of this entry should be a number. Any articles with a score lower than this number will be marked as read and removed from the summary buffer.

thread-mark-and-expunge
The value of this entry should be a number. All articles that belong to a thread that has a total score below this number will be marked as read and removed from the summary buffer. gnus-thread-score-function says how to compute the total score for a thread.

files
The value of this entry should be any number of file names. These files are assumed to be score files as well, and will be loaded the same way this one was.

exclude-files
The value of this entry should be any number of file names. These files will not be loaded, even though they would normally be so, for some reason or other.

eval
The value of this entry will be evaled. This element will be ignored when handling global score files.

read-only
Read-only score files will not be updated or saved. Global score files should feature this atom (see section 7.12 Global Score Files). (Note: Global here really means global; not your personal apply-to-all-groups score files.)

orphan
The value of this entry should be a number. Articles that do not have parents will get this number added to their scores. Imagine you follow some high-volume newsgroup, like `comp.lang.c'. Most likely you will only follow a few of the threads, but also want to see any new threads.

You can do this with the following two score file entries:

 
        (orphan -500)
        (mark-and-expunge -100)

When you enter the group the first time, you will only see the new threads. You then raise the score of the threads that you find interesting (with I T or I S), and ignore (C y) the rest. Next time you enter the group, you will see new articles in the interesting threads, plus any new threads.

I.e., the orphan score atom is for high-volume groups where there are a few interesting threads which can't be found automatically by ordinary scoring rules.

adapt
This entry controls the adaptive scoring. If it is t, the default adaptive scoring rules will be used. If it is ignore, no adaptive scoring will be performed on this group. If it is a list, this list will be used as the adaptive scoring rules. If it isn't present, or is something other than t or ignore, the default adaptive scoring rules will be used. If you want to use adaptive scoring on most groups, you'd set gnus-use-adaptive-scoring to t, and insert an (adapt ignore) in the groups where you do not want adaptive scoring. If you only want adaptive scoring in a few groups, you'd set gnus-use-adaptive-scoring to nil, and insert (adapt t) in the score files of the groups where you want it.

adapt-file
All adaptive score entries will go to the file named by this entry. It will also be applied when entering the group. This atom might be handy if you want to adapt on several groups at once, using the same adaptive file for a number of groups.

local
The value of this entry should be a list of (var value) pairs. Each var will be made buffer-local to the current summary buffer, and set to the value specified. This is a convenient, if somewhat strange, way of setting variables in some groups if you don't like hooks much. Note that the value won't be evaluated.
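
To tie several of these atoms together, here is a sketch of a score file for a single group. All numbers are invented for illustration, and the combination is just one plausible setup, not a recommendation:

```elisp
(("lines"
  ;; Lower the score of articles with fewer than 4 lines.
  (4 -50 nil <))
 ;; Mark articles scoring below -100 as read.
 (mark -100)
 ;; Hide articles scoring below -1000 from the summary buffer.
 (expunge -1000)
 ;; Mark as read and hide whole threads whose total score is very low.
 (thread-mark-and-expunge -2000)
 ;; No adaptive scoring in this group.
 (adapt ignore))
```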


7.5 Score File Editing

You normally enter all scoring commands from the summary buffer, but you might feel the urge to edit them by hand as well, so we've supplied you with a mode for that.

It's simply a slightly customized emacs-lisp mode, with these additional commands:

C-c C-c
Save the changes you have made and return to the summary buffer (gnus-score-edit-done).

C-c C-d
Insert the current date in numerical format (gnus-score-edit-insert-date). This is really the day number, if you were wondering.

C-c C-p
The adaptive score files are saved in an unformatted fashion. If you intend to read one of these files, you want to pretty print it first. This command (gnus-score-pretty-print) does that for you.
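
The day number inserted by C-c C-d counts days since December 31, 1 BCE (see section 7.4 Score File Format). As a sketch, assuming that the time-to-days function from the time-date library uses the same epoch as the score file date element, you can compute such numbers yourself:

```elisp
(require 'time-date)

;; Day number for today, comparable to the date element of a score entry.
(time-to-days (current-time))

;; Day number for 1 January 2000 (local time); this should be 730120.
(time-to-days (encode-time 0 0 0 1 1 2000))
```

Subtracting an entry's date element from today's day number gives the entry's age in days, which is what gnus-score-expiry-days is compared against.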

Type M-x gnus-score-mode to use this mode.

gnus-score-menu-hook is run in score mode buffers.

In the summary buffer you can use commands like V f, V e and V t to begin editing score files.


7.6 Adaptive Scoring

If all this scoring is getting you down, Gnus has a way of making it all happen automatically--as if by magic. Or rather, as if by artificial stupidity, to be precise.

When you read an article, or mark an article as read, or kill an article, you leave marks behind. On exit from the group, Gnus can sniff these marks and add score elements depending on what marks it finds. You turn on this ability by setting gnus-use-adaptive-scoring to t or (line). If you want to score adaptively on separate words appearing in the subjects, you should set this variable to (word). If you want to use both adaptive methods, set this variable to (word line).
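
For instance, a line like the following in your Gnus init file would enable both methods (a minimal sketch; use t or '(line) if you only want header-based adaptation):

```elisp
;; Adapt both on whole header lines and on individual subject words.
(setq gnus-use-adaptive-scoring '(word line))
```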

To give you complete control over the scoring process, you can customize the gnus-default-adaptive-score-alist variable. For instance, it might look something like this:

 
(setq gnus-default-adaptive-score-alist
  '((gnus-unread-mark)
    (gnus-ticked-mark (from 4))
    (gnus-dormant-mark (from 5))
    (gnus-del-mark (from -4) (subject -1))
    (gnus-read-mark (from 4) (subject 2))
    (gnus-expirable-mark (from -1) (subject -1))
    (gnus-killed-mark (from -1) (subject -3))
    (gnus-kill-file-mark)
    (gnus-ancient-mark)
    (gnus-low-score-mark)
    (gnus-catchup-mark (from -1) (subject -1))))

As you see, each element in this alist has a mark as a key (either a variable name or a "real" mark--a character). Following this key is an arbitrary number of header/score pairs. If there are no header/score pairs following the key, no adaptive scoring will be done on articles that have that key as the article mark. For instance, articles with gnus-unread-mark in the example above will not get adaptive score entries.

Each article can have only one mark, so just a single of these rules will be applied to each article.

To take gnus-del-mark as an example--this alist says that all articles that have that mark (i.e., are marked with `e') will get score entries that lower the score by 4 based on the From header and by 1 based on the Subject header. Change this to fit your prejudices.

If you have marked 10 articles with the same subject with gnus-del-mark, the rule for that mark will be applied ten times. That means that that subject will get a score of ten times -1, which should be, unless I'm much mistaken, -10.
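
The arithmetic can be sketched with a toy model. The variable and function below are hypothetical and much simpler than what Gnus does internally; they only illustrate that each marked article applies its rule once, so the adjustments add up:

```elisp
(defvar my-adaptive-rules
  '((del  (from . -4) (subject . -1))
    (read (from .  4) (subject .  2)))
  "Hypothetical, simplified stand-in for `gnus-default-adaptive-score-alist'.")

(defun my-adaptive-total (mark header count)
  "Return the total adjustment for HEADER after COUNT articles with MARK."
  (let ((delta (cdr (assq header (cdr (assq mark my-adaptive-rules))))))
    ;; A missing rule contributes nothing.
    (* count (or delta 0))))

;; Ten deleted articles with the same subject:
(my-adaptive-total 'del 'subject 10)   ; => -10
```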

If you have auto-expirable (mail) groups (see section 6.3.9 Expiring Mail), all the read articles will be marked with the `E' mark. This'll probably make adaptive scoring slightly impossible, so auto-expiring and adaptive scoring don't really mix very well.

The headers you can score on are from, subject, message-id, references, xref, lines, chars and date. In addition, you can score on followup, which will create an adaptive score entry that matches on the References header using the Message-ID of the current article, thereby matching the following thread.

If you use this scheme, you should set the score file atom mark to something small--like -300, perhaps, to avoid having small random changes result in articles getting marked as read.

After using adaptive scoring for a week or so, Gnus should start to become properly trained and enhance the authors you like best, and kill the authors you like least, without you having to say so explicitly.

You can control what groups the adaptive scoring is to be performed on by using the score files (see section 7.4 Score File Format). This will also let you use different rules in different groups.
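
As a sketch, the two score files below show the per-group switches described in section 7.4 Score File Format. The first is for a group where adaptive scoring is wanted, with the low mark threshold suggested above; the second turns adaptive scoring off for a group. The value -300 is only the example figure from the text:

```elisp
;; Contents of a (hypothetical) score file for a group that should adapt:
((mark -300)
 (adapt t))

;; Contents of a (hypothetical) score file for a group that should not:
((adapt ignore))
```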

The adaptive score entries will be put into a file where the name is the group name with gnus-adaptive-file-suffix appended. The default is `ADAPT'.

When doing adaptive scoring, substring or fuzzy matching would probably give you the best results in most cases. However, if the header one matches is short, the possibility for false positives is great, so if the length of the match is less than gnus-score-exact-adapt-limit, exact matching will be used. If this variable is nil, exact matching will always be used to avoid this problem.

As mentioned above, you can adapt either on individual words or entire headers. If you adapt on words, the gnus-default-adaptive-word-score-alist variable says what score each instance of a word should add given a mark.

 
(setq gnus-default-adaptive-word-score-alist
      `((,gnus-read-mark . 30)
        (,gnus-catchup-mark . -10)
        (,gnus-killed-mark . -20)
        (,gnus-del-mark . -15)))

This is the default value. If you have adaption on words enabled, every word that appears in subjects of articles marked with gnus-read-mark will result in a score rule that increases the score by 30 points.

Words that appear in the gnus-default-ignored-adaptive-words list will be ignored. If you wish to add more words to be ignored, use the gnus-ignored-adaptive-words list instead.

Some may feel that short words shouldn't count when doing adaptive scoring. If so, you may set gnus-adaptive-word-length-limit to an integer. Words shorter than this number will be ignored. This variable defaults to nil.

When the scoring is done, gnus-adaptive-word-syntax-table is the syntax table in effect. It is similar to the standard syntax table, but it considers numbers to be non-word-constituent characters.

If gnus-adaptive-word-minimum is set to a number, the adaptive word scoring process will never bring down the score of an article to below this number. The default is nil.

If gnus-adaptive-word-no-group-words is set to t, Gnus won't adaptively word score any of the words in the group name. Useful for groups like `comp.editors.emacs', where most of the subject lines contain the word `emacs'.
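
Taken together, the word-scoring knobs described above could be set roughly like this. The values and the extra ignored words are arbitrary examples, not recommendations:

```elisp
(setq gnus-adaptive-word-length-limit 5       ; ignore words shorter than 5 characters
      gnus-adaptive-word-minimum -500         ; word scoring never drops a score below -500
      gnus-adaptive-word-no-group-words t     ; skip words that occur in the group name
      gnus-ignored-adaptive-words '("patch" "release"))  ; additional words to ignore
```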

After using this scheme for a while, it might be nice to write a gnus-psychoanalyze-user command to go through the rules and see what words you like and what words you don't like. Or perhaps not.

Note that the adaptive word scoring thing is highly experimental and is likely to change in the future. Initial impressions seem to indicate that it's totally useless as it stands. Some more work (involving more rigorous statistical methods) will have to be done to make this useful.


7.7 Home Score File

The score file where new score file entries will go is called the home score file. This is normally (and by default) the score file for the group itself. For instance, the home score file for `gnu.emacs.gnus' is `gnu.emacs.gnus.SCORE'.

However, this may not be what you want. It is often convenient to share a common home score file among many groups--all `emacs' groups could perhaps use the same home score file.

The variable that controls this is gnus-home-score-file. It can be:

  1. A string. Then this file will be used as the home score file for all groups.

  2. A function. The result of this function will be used as the home score file. The function will be called with the name of the group as the parameter.

  3. A list. The elements in this list can be:

    1. (regexp file-name). If the regexp matches the group name, the file-name will be used as the home score file.

    2. A function. If the function returns non-nil, the result will be used as the home score file. The function will be called with the name of the group as the parameter.

    3. A string. Use the string as the home score file.

    The list will be traversed from the beginning towards the end looking for matches.

So, if you want to use just a single score file, you could say:

 
(setq gnus-home-score-file
      "my-total-score-file.SCORE")

If you want to use `gnu.SCORE' for all `gnu' groups and `rec.SCORE' for all `rec' groups (and so on), you can say:

 
(setq gnus-home-score-file
      'gnus-hierarchial-home-score-file)

This is a ready-made function provided for your convenience. Other functions include

gnus-current-home-score-file
Return the "current" regular score file. This will make scoring commands add entries to the "innermost" matching score file.

If you want to have one score file for the `emacs' groups and another for the `comp' groups, while letting all other groups use their own home score files:

 
(setq gnus-home-score-file
      ;; All groups that match the regexp "\\.emacs"
      '(("\\.emacs" "emacs.SCORE")
        ;; All the comp groups in one score file
        ("^comp" "comp.SCORE")))
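
A function can be used as a list element in the same way. The following is only a sketch: my-gnus-home-score-file is a hypothetical name, and the file names are made up. Because the function returns nil for groups it doesn't care about, it relies on the list behaviour described above, where only a non-nil result is used.

 
(defun my-gnus-home-score-file (group)
  "Return a home score file name for GROUP, or nil to decline.
Hypothetical helper, not part of Gnus."
  (cond ((string-match-p "\\.emacs" group) "emacs.SCORE")
        ((string-match-p "\\`comp\\." group) "comp.SCORE")))

;; Use the function as the only element of the list.
(setq gnus-home-score-file '(my-gnus-home-score-file))

With this definition, (my-gnus-home-score-file "gnu.emacs.gnus") returns "emacs.SCORE", and (my-gnus-home-score-file "rec.humor") returns nil.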

gnus-home-adapt-file works exactly the same way as gnus-home-score-file, but says what the home adaptive score file is instead. All new adaptive file entries will go into the file specified by this variable, and the same syntax is allowed.
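
Since the same syntax is allowed, a setting for the adaptive files might look like the sketch below. The `.ADAPT' file names are assumptions chosen for illustration.

 
(setq gnus-home-adapt-file
      ;; All groups that match the regexp "\\.emacs"
      '(("\\.emacs" "emacs.ADAPT")
        ;; All the comp groups in one adaptive score file
        ("^comp" "comp.ADAPT")))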

In addition to using gnus-home-score-file and gnus-home-adapt-file, you can also use group parameters (see section 2.10 Group Parameters) and topic parameters (see section 2.16.5 Topic Parameters) to achieve much the same. Group and topic parameters take precedence over this variable.


7.8 Followups To Yourself

Gnus offers two commands for picking out the Message-ID header in the current buffer. Gnus will then add a score rule that scores using this Message-ID on the References header of other articles. This will, in effect, increase the score of all articles that respond to the article in the current buffer. Quite useful if you want to easily note when people answer what you've said.

gnus-score-followup-article
This will add a score to articles that directly follow up your own article.

gnus-score-followup-thread
This will add a score to all articles that appear in a thread "below" your own article.

These two functions are both primarily meant to be used in hooks like message-sent-hook, like this:

 
(add-hook 'message-sent-hook 'gnus-score-followup-thread)

If you look closely at your own Message-ID, you'll notice that the first two or three characters are always the same. Here are two of mine:

 
<x6u3u47icf.fsf@eyesore.no>
<x6sp9o7ibw.fsf@eyesore.no>

So "my" ident on this machine is `x6'. This can be exploited--the following rule will raise the score on all followups to myself:

 
("references"
 ("<x6[0-9a-z]+\\.fsf\\(_-_\\)?@.*eyesore\\.no>"
  1000 nil r))

Whether it's the first two or first three characters that are "yours" is system-dependent.
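
Before putting such a rule into a score file, it can be worth checking the regexp by hand. The sketch below evaluates the regexp from the rule above against one of the Message-IDs shown earlier and against a made-up Message-ID that should not match.

 
(let ((regexp "<x6[0-9a-z]+\\.fsf\\(_-_\\)?@.*eyesore\\.no>"))
  (list (string-match-p regexp "<x6u3u47icf.fsf@eyesore.no>")
        ;; Made-up Message-ID from somebody else.
        (string-match-p regexp "<87abcdefgh.fsf@example.org>")))
;; => (0 nil)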


7.9 Scoring On Other Headers

Gnus is quite fast when scoring the "traditional" headers--`From', `Subject' and so on. However, scoring other headers requires writing a head scoring rule, which means that Gnus has to request every single article from the back end to find matches. This takes a long time in big groups.

Now, there's not much you can do about this for news groups, but for mail groups, you have greater control. Section 3.1.2 To From Newsgroups explains this mechanism in greater detail, but here's a cookbook example for nnml on how to allow scoring on the `To' and `Cc' headers.

Put the following in your `~/.gnus.el' file.

 
(setq gnus-extra-headers '(To Cc Newsgroups Keywords)
      nnmail-extra-headers gnus-extra-headers)

Restart Gnus and rebuild your nnml overview files with the M-x nnml-generate-nov-databases command. This will take a long time if you have much mail.

Now you can score on `To' and `Cc' as "extra headers" like so: I e s p To RET <your name> RET.

See? Simple.


7.10 Scoring Tips

Crossposts
If you want to lower the score of crossposts, the line to match on is the Xref header.
 
("xref" (" talk.politics.misc:" -1000))

Multiple crossposts
If you want to lower the score of articles that have been crossposted to, say, three or more groups:
 
("xref"
  ("[^:\n]+:[0-9]+ +[^:\n]+:[0-9]+ +[^:\n]+:[0-9]+"
   -1000 nil r))

Matching on the body
This is generally not a very good idea--it takes a very long time. Gnus actually has to fetch each individual article from the server. But you might want to anyway, I guess. Even though there are three match keys (Head, Body and All), you should choose one and stick with it in each score file. If you use any two, each article will be fetched twice. If you want to match a bit on the Head and a bit on the Body, just use All for all the matches.

Marking as read
You will probably want to mark articles that have scores below a certain number as read. This is most easily achieved by putting the following in your `all.SCORE' file:
 
((mark -100))
You may also consider doing something similar with expunge.

Negated character classes
If you say stuff like [^abcd]*, you may get unexpected results. That will match newlines, which might lead to, well, The Unknown. Say [^abcd\n]* instead.
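
Two of the tips above can be checked by hand. The sketch below first tries the multiple-crossposts regexp on two made-up Xref values (the host and group names are assumptions): it matches once three group entries are present. It then shows that a negated character class happily spans a newline unless `\n' is excluded.

 
(let ((crossposts "[^:\n]+:[0-9]+ +[^:\n]+:[0-9]+ +[^:\n]+:[0-9]+"))
  (list (string-match-p crossposts "news.example.com alt.a:1 alt.b:2 alt.c:3")
        (string-match-p crossposts "news.example.com alt.a:1 alt.b:2")))
;; => (0 nil)

;; The string contains a newline between the two "xyz" parts.
(list (string-match-p "\\`[^abcd]*\\'" "xyz\nxyz")
      (string-match-p "\\`[^abcd\n]*\\'" "xyz\nxyz"))
;; => (0 nil)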


7.11 Reverse Scoring

If you want to keep just articles that have `Sex with Emacs' in the subject header, and expunge all other articles, you could put something like this in your score file:

 
(("subject"
  ("Sex with Emacs" 2))
 (mark 1)
 (expunge 1))

So, you raise all articles that match `Sex with Emacs' and mark the rest as read, and expunge them to boot.


7.12 Global Score Files

Sure, other newsreaders have "global kill files". These are usually nothing more than a single kill file that applies to all groups, stored in the user's home directory. Bah! Puny, weak newsreaders!

What I'm talking about here are Global Score Files. Score files from all over the world, from users everywhere, uniting all nations in one big, happy score file union! Ange-score! New and untested!

All you have to do to use other people's score files is to set the gnus-global-score-files variable. Add one entry for each score file, or for each score file directory. Gnus will decide by itself what score files are applicable to which group.

To use the score file `/ftp@ftp.gnus.org:/pub/larsi/ding/score/soc.motss.SCORE' and all score files in the `/ftp@ftp.some-where:/pub/score' directory, say this:

 
(setq gnus-global-score-files
      '("/ftp@ftp.gnus.org:/pub/larsi/ding/score/soc.motss.SCORE"
        "/ftp@ftp.some-where:/pub/score/"))

Simple, eh? Directory names must end with a `/'. These directories are typically scanned only once during each Gnus session. If you feel the need to manually re-scan the remote directories, you can use the gnus-score-search-global-directories command.

Note that, at present, using this option will slow down group entry somewhat. (That is--a lot.)

If you want to start maintaining score files for other people to use, just put your score file up for anonymous ftp and announce it to the world. Become a retro-moderator! Participate in the retro-moderator wars sure to ensue, where retro-moderators battle it out for the sympathy of the people, luring them to use their score files on false premises! Yay! The net is saved!

Here are some tips for the would-be retro-moderator, off the top of my head:

... I wonder whether other newsreaders will support global score files in the future. Snicker. Yup, any day now, newsreaders like Blue Wave, xrn and 1stReader are bound to implement scoring. Should we start holding our breath yet?


7.13 Kill Files

Gnus still supports those pesky old kill files. In fact, the kill file entries can now be expiring, which is something I wrote before Daniel Quinlan thought of doing score files, so I've left the code in there.

In short, kill processing is a lot slower (and I do mean a lot) than score processing, so it might be a good idea to rewrite your kill files into score files.

Anyway, a kill file is a normal emacs-lisp file. You can put any forms into this file, which means that you can use kill files as some sort of primitive hook function to be run on group entry, even though that isn't a very good idea.

Normal kill files look like this:

 
(gnus-kill "From" "Lars Ingebrigtsen")
(gnus-kill "Subject" "ding")
(gnus-expunge "X")

This will mark every article written by me as read, and remove the marked articles from the summary buffer. Very useful, you'll agree.

Other programs use a totally different kill file syntax. If Gnus encounters what looks like an rn kill file, it will take a stab at interpreting it.

Two summary functions for editing a GNUS kill file:

M-k
Edit this group's kill file (gnus-summary-edit-local-kill).

M-K
Edit the general kill file (gnus-summary-edit-global-kill).

Two group mode functions for editing the kill files:

M-k
Edit this group's kill file (gnus-group-edit-local-kill).

M-K
Edit the general kill file (gnus-group-edit-global-kill).

Kill file variables:

gnus-kill-file-name
A kill file for the group `soc.motss' is normally called `soc.motss.KILL'. The suffix appended to the group name to get this file name is detailed by the gnus-kill-file-name variable. The "global" kill file (not in the score file sense of "global", of course) is just called `KILL'.

gnus-kill-save-kill-file
If this variable is non-nil, Gnus will save the kill file after processing, which is necessary if you use expiring kills.

gnus-apply-kill-hook
A hook called to apply kill files to a group. It is (gnus-apply-kill-file) by default. If you want to ignore the kill file if you have a score file for the same group, you can set this hook to (gnus-apply-kill-file-unless-scored). If you don't want kill files to be processed, you should set this variable to nil.

gnus-kill-file-mode-hook
A hook called in kill-file mode buffers.
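
As a small sketch of the gnus-apply-kill-hook settings described above, either of the following could go in your `~/.gnus.el' file. Pick one; the two forms are alternatives.

 
;; Skip the kill file when the group also has a score file.
(setq gnus-apply-kill-hook '(gnus-apply-kill-file-unless-scored))

;; Or: don't process kill files at all.
(setq gnus-apply-kill-hook nil)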


7.14 Converting Kill Files

If you have loads of old kill files, you may want to convert them into score files. If they are "regular", you can use the `gnus-kill-to-score.el' package; if not, you'll have to do it by hand.

The kill to score conversion package isn't included in Gnus by default. You can fetch it from http://www.stud.ifi.uio.no/~larsi/ding-various/gnus-kill-to-score.el.

If your old kill files are very complex--if they contain more non-gnus-kill forms than not, you'll have to convert them by hand. Or just let them be as they are. Gnus will still use them as before.


7.15 GroupLens

NOTE: Unfortunately the GroupLens system seems to have shut down, so this section is mostly of historical interest.

GroupLens is a collaborative filtering system that helps you work together with other people to find the quality news articles out of the huge volume of news articles generated every day.

To accomplish this the GroupLens system combines your opinions about articles you have already read with the opinions of others who have done likewise and gives you a personalized prediction for each unread news article. Think of GroupLens as a matchmaker. GroupLens watches how you rate articles, and finds other people that rate articles the same way. Once it has found some people you agree with it tells you, in the form of a prediction, what they thought of the article. You can use this prediction to help you decide whether or not you want to read the article.

7.15.1 Using GroupLens  How to make Gnus use GroupLens.
7.15.2 Rating Articles  Letting GroupLens know how you rate articles.
7.15.3 Displaying Predictions  Displaying predictions given by GroupLens.
7.15.4 GroupLens Variables  Customizing GroupLens.


7.15.1 Using GroupLens

To use GroupLens you must register a pseudonym with your local Better Bit Bureau (BBB). At the moment there is only one Better Bit Bureau in town.

Once you have registered you'll need to set a couple of variables.

gnus-use-grouplens
Setting this variable to a non-nil value will make Gnus hook into all the relevant GroupLens functions.

grouplens-pseudonym
This variable should be set to the pseudonym you got when registering with the Better Bit Bureau.

grouplens-newsgroups
A list of groups that you want to get GroupLens predictions for.

That's the minimum of what you need to get up and running with GroupLens. Once you've registered, GroupLens will start giving you scores for articles based on the average of what other people think. But, to get the real benefit of GroupLens you need to start rating articles yourself. Then the scores GroupLens gives you will be personalized for you, based on how the people you usually agree with have already rated.
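
For the record, a minimal setup along the lines described above might have looked like the following sketch. The pseudonym and the group list are made-up placeholders.

 
(setq gnus-use-grouplens t
      ;; Placeholder; use the pseudonym you registered.
      grouplens-pseudonym "my-pseudonym"
      ;; Example groups only.
      grouplens-newsgroups '("rec.humor" "comp.editors"))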


7.15.2 Rating Articles

In GroupLens, an article is rated on a scale from 1 to 5, inclusive, where 1 means something like "this article is a waste of bandwidth" and 5 means that the article was really good. The basic question to ask yourself is, "on a scale from 1 to 5 would I like to see more articles like this one?"

There are four ways to enter a rating for an article in GroupLens.

r
This function will prompt you for a rating on a scale of one to five.

k
This function will prompt you for a rating, and rate all the articles in the thread. This is really useful for some of those long running giant threads in rec.humor.

The next two commands, `n' and `,', take a numerical prefix to be the score of the article you're reading.

1-5 n
Rate the article and go to the next unread article.

1-5 ,
Rate the article and go to the next unread article with the highest score.

If you want to give the current article a score of 4 and then go to the next article, just type 4 n.


7.15.3 Displaying Predictions

GroupLens makes a prediction for you about how much you will like a news article. The predictions from GroupLens are on a scale from 1 to 5, where 1 is the worst and 5 is the best. You can use the predictions from GroupLens in one of three ways controlled by the variable gnus-grouplens-override-scoring.

You may choose to have the GroupLens scores contribute to, or override, the regular Gnus scoring mechanism. override is the default; however, some people prefer to see the Gnus scores plus the GroupLens scores. To have the GroupLens predictions replace the regular Gnus scores, set gnus-grouplens-override-scoring to 'override. To see the Gnus scores and the GroupLens predictions separately, set it to 'separate. To combine the two into a single score, set it to 'combine. When you use the combine option you will also want to set the values for grouplens-prediction-offset and grouplens-score-scale-factor.

In either case, GroupLens gives you a few choices for how you would like to see your predictions displayed. The display of predictions is controlled by the grouplens-prediction-display variable.

The following are valid values for that variable.

prediction-spot
The higher the prediction, the further to the right an `*' is displayed.

confidence-interval
A numeric confidence interval.

prediction-bar
The higher the prediction, the longer the bar.

confidence-bar
Numerical confidence.

confidence-spot
The spot gets bigger with more confidence.

prediction-num
Plain-old numeric value.

confidence-plus-minus
Prediction +/- confidence.


7.15.4 GroupLens Variables

gnus-summary-grouplens-line-format
The summary line format used in GroupLens-enhanced summary buffers. It accepts the same specs as the normal summary line format (see section 3.1.1 Summary Buffer Lines). The default is `%U%R%z%l%I%(%[%4L: %-23,23n%]%) %s\n'.

grouplens-bbb-host
Host running the bbbd server. `grouplens.cs.umn.edu' is the default.

grouplens-bbb-port
Port of the host running the bbbd server. The default is 9000.

grouplens-score-offset
Offset the prediction by this value. In other words, subtract this number from the prediction value to arrive at the effective score. The default is 0.

grouplens-score-scale-factor
This variable allows the user to magnify the effect of GroupLens scores. The scale factor is applied after the offset. The default is 1.
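
Read together, the two descriptions above suggest the following arithmetic. This is only an illustration of the wording, not the code GroupLens itself uses, and my-grouplens-effective-score is a hypothetical helper.

 
(defun my-grouplens-effective-score (prediction offset scale-factor)
  "Subtract OFFSET from PREDICTION, then multiply by SCALE-FACTOR.
Hypothetical helper illustrating the description above."
  (* scale-factor (- prediction offset)))

(list (my-grouplens-effective-score 5 3 100)
      (my-grouplens-effective-score 1 3 100))
;; => (200 -200)

With an offset of 3, predictions above the middle of the 1 to 5 scale raise the score and predictions below it lower the score.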


7.16 Advanced Scoring

Scoring on Subjects and From headers is nice enough, but what if you're really interested in what a person has to say only when she's talking about a particular subject? Or what if you really don't want to read what person A has to say when she's following up to person B, but want to read what she says when she's following up to person C?

By using advanced scoring rules you may create arbitrarily complex scoring patterns.

7.16.1 Advanced Scoring Syntax  A definition.
7.16.2 Advanced Scoring Examples  What they look like.
7.16.3 Advanced Scoring Tips  Getting the most out of it.


7.16.1 Advanced Scoring Syntax

Ordinary scoring rules have a string as the first element in the rule. Advanced scoring rules have a list as the first element. The second element is the score to be applied if the first element evaluates to a non-nil value.

These lists may consist of three logical operators, one indirection operator, and various match operators.

Logical operators:

&
and
This logical operator will evaluate each of its arguments until it finds one that evaluates to false, and then it'll stop. If all arguments evaluate to true values, then this operator will return true.

|
or
This logical operator will evaluate each of its arguments until it finds one that evaluates to true. If no arguments are true, then this operator will return false.

!
not
¬
This logical operator only takes a single argument. It returns the logical negation of the value of its argument.

There is an indirection operator that will make its arguments apply to the ancestors of the current article being scored. For instance, 1- will make score rules apply to the parent of the current article. 2- will make score rules apply to the grandparent of the current article. Alternatively, you can write ^^, where the number of ^s (carets) says how far back into the ancestry you want to go.

Finally, we have the match operators. These are the ones that do the real work. Match operators are header name strings followed by a match and a match type. A typical match operator looks like `("from" "Lars Ingebrigtsen" s)'. The header names are the same as when using simple scoring, and the match types are also the same.
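
To see how the pieces fit together, here is a sketch of a single rule that uses a logical operator, the indirection operator and two match operators. The score of 500 is an arbitrary choice. It raises articles with `Gnus' in the subject whose grandparent was written by Lars:

 
((&
  ("subject" "Gnus" s)
  ;; ^^ is the same as 2-: look at the grandparent.
  (^^ ("from" "Lars Ingebrigtsen" s)))
 500)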


7.16.2 Advanced Scoring Examples

Please note that the following examples are score file rules. To make a complete score file from them, surround them with another pair of parentheses.

Let's say you want to increase the score of articles written by Lars when he's talking about Gnus:

 
((&
  ("from" "Lars Ingebrigtsen")
  ("subject" "Gnus"))
 1000)

Quite simple, huh?
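
Following the note at the start of this section, a complete score file containing only this rule would be the same form surrounded by another pair of parentheses:

 
(((&
   ("from" "Lars Ingebrigtsen")
   ("subject" "Gnus"))
  1000))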

When he writes long articles, he sometimes has something nice to say:

 
((&
  ("from" "Lars Ingebrigtsen")
  (|
   ("subject" "Gnus")
   ("lines" 100 >)))
 1000)

However, when he responds to things written by Reig Eigil Logge, you really don't want to read what he's written:

 
((&
  ("from" "Lars Ingebrigtsen")
  (1- ("from" "Reig Eigil Logge")))
 -100000)

Everybody that follows up Redmondo when he writes about disappearing socks should have their scores raised, but only when they talk about white socks. However, when Lars talks about socks, it's usually not very interesting:

 
((&
  (1-
   (&
    ("from" "redmondo@.*no" r)
    ("body" "disappearing.*socks" t)))
  (! ("from" "Lars Ingebrigtsen"))
  ("body" "white.*socks"))
 1000)

Suppose you're reading a high volume group and you're only interested in replies. The plan is to score down all articles that don't have subjects that begin with "Re:", "Fw:" or "Fwd:" and then score up all parents of articles that have subjects that begin with reply marks.

 
((! ("subject" "re:\\|fwd?:" r))
  -200)
((1- ("subject" "re:\\|fwd?:" r))
  200)

The possibilities are endless.


7.16.3 Advanced Scoring Tips

The & and | logical operators do short-circuit logic. That is, they stop processing their arguments when it's clear what the result of the operation will be. For instance, if one of the arguments of an & evaluates to false, there's no point in evaluating the rest of the arguments. This means that you should put slow matches (`body', `head') last and quick matches (`from', `subject') first.
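
As a sketch of that ordering, the rule below puts the quick `from' match first, so the slow `body' match is only tried for articles that Lars wrote. The body regexp and the score are just examples.

 
((&
  ;; Quick match first.
  ("from" "Lars Ingebrigtsen")
  ;; Slow match last; only reached if the first one succeeded.
  ("body" "white.*socks" r))
 1000)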

The indirection arguments (1- and so on) will make their arguments work on previous generations of the thread. If you say something like:

 
...
(1-
 (1-
  ("from" "lars")))
...

Then that means "score on the from header of the grandparent of the current article". An indirection is quite fast, but it's better to say:

 
(1-
 (&
  ("from" "Lars")
  ("subject" "Gnus")))

than it is to say:

 
(&
 (1- ("from" "Lars"))
 (1- ("subject" "Gnus")))


7.17 Score Decays

You may find that your scores have a tendency to grow without bounds, especially if you're using adaptive scoring. If scores get too big, they lose all meaning--they simply max out and it's difficult to use them in any sensible way.

Gnus provides a mechanism for decaying scores to help with this problem. When score files are loaded and gnus-decay-scores is non-nil, Gnus will run the score files through the decaying mechanism, thereby lowering the scores of all non-permanent score rules. The decay itself is performed by the gnus-decay-score-function function, which is gnus-decay-score by default. Here's the definition of that function:

 
(defun gnus-decay-score (score)
  "Decay SCORE according to `gnus-score-decay-constant'
and `gnus-score-decay-scale'."
  (let ((n (- score
              (* (if (< score 0) -1 1)
                 (min (abs score)
                      (max gnus-score-decay-constant
                           (* (abs score)
                              gnus-score-decay-scale)))))))
    (if (and (featurep 'xemacs)
             ;; XEmacs' floor can handle only the floating point
             ;; number below the half of the maximum integer.
             (> (abs n) (lsh -1 -2)))
        (string-to-number
         (car (split-string (number-to-string n) "\\.")))
      (floor n))))

gnus-score-decay-constant is 3 by default and gnus-score-decay-scale is 0.05. This should cause the following:

  1. Scores between -3 and 3 will be set to 0 when this function is called.

  2. Scores with magnitudes between 3 and 60 will be shrunk by 3.

  3. Scores with magnitudes greater than 60 will be shrunk by 5% of the score.
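
The three cases can be checked with a few sample scores. The sketch below uses a standalone copy of the arithmetic, without the XEmacs special case, so that it can be evaluated without loading Gnus; my-decay-score is a hypothetical name, and the constant and scale are passed in explicitly.

 
(defun my-decay-score (score constant scale)
  "Decay SCORE using CONSTANT and SCALE.
Hypothetical standalone copy of the arithmetic shown above."
  (floor (- score
            (* (if (< score 0) -1 1)
               (min (abs score)
                    (max constant
                         (* (abs score) scale)))))))

;; Default values: constant 3, scale 0.05.
(mapcar (lambda (score) (my-decay-score score 3 0.05))
        '(2 -2 10 -10 100 -100))
;; => (0 0 7 -7 95 -95)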

If you don't like this decay function, write your own. It is called with the score to be decayed as its only parameter, and it should return the new score, which should be an integer.
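
For instance, a much more aggressive decay could simply halve every score. This is only a sketch; my-gnus-halve-score is a hypothetical name, and it assumes that the score it is given is an integer.

 
(defun my-gnus-halve-score (score)
  "Halve SCORE, rounding toward zero.
Hypothetical decay function; assumes SCORE is an integer."
  (/ score 2))

;; Turn decaying on and use the function above.
(setq gnus-decay-scores t
      gnus-decay-score-function 'my-gnus-halve-score)

With this function a score of 7 decays to 3, and a score of -7 decays to -3.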

Gnus will try to decay scores once a day. If you haven't run Gnus for four days, Gnus will decay the scores four times, for instance.


This document was generated by XEmacs Webmaster on October, 2 2007 using texi2html