Rather than storing PubDate as a raw string, we now attempt to parse it.
The parser also tries a couple of non-standard time formats that I've
seen in the wild.
Breaking change: Item.PubDate is no longer a string; it is now a time.Time.
In this iteration the key passed to the database is the item's Title, which
is obviously a poor choice since titles need not be unique.
I'm still looking for a configurable way of generating the unique key.
The database is a really simple map. A call to it checks whether the key
exists and returns a bool. When the result is false, the key is added.
The key is just a string, which may or may not be sufficient.
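The check-then-add behaviour can be sketched roughly as follows. The type
name seenDB and the method name Request are assumptions made for this
example, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// seenDB is a hypothetical stand-in for the map-backed database: it only
// remembers which string keys it has been asked about.
type seenDB struct {
	seen map[string]struct{}
}

func newSeenDB() *seenDB {
	return &seenDB{seen: make(map[string]struct{})}
}

// Request reports whether key is already known. If it is not, the key is
// recorded, so a second call with the same key returns true.
func (d *seenDB) Request(key string) bool {
	if _, ok := d.seen[key]; ok {
		return true
	}
	d.seen[key] = struct{}{}
	return false
}

func main() {
	db := newSeenDB()
	for _, key := range []string{"first post", "second post", "first post"} {
		if db.Request(key) {
			fmt.Println("already seen:", key)
		} else {
			fmt.Println("new item:", key)
		}
	}
}
```

Because the key is an opaque string, the caller decides what goes into it,
which is where a configurable key generator would plug in.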
Turn the Item.Guid field into a string pointer so that it can properly
be set to nil when applicable. Adjust the remaining code and tests
to reflect this change.
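A short sketch of what a pointer-typed Guid means for calling code. The Item
struct here is a trimmed, hypothetical version with only two fields, and
guidString is an illustrative helper rather than part of the package.

```go
package main

import "fmt"

// Item is a reduced, hypothetical version of the feed item type.
type Item struct {
	Title string
	Guid  *string // nil when no guid is set
}

// guidString returns the guid and true when one is set, or "" and false
// when Guid is nil, so callers never dereference a nil pointer.
func guidString(it Item) (string, bool) {
	if it.Guid == nil {
		return "", false
	}
	return *it.Guid, true
}

func main() {
	id := "urn:example:1"
	items := []Item{
		{Title: "with guid", Guid: &id},
		{Title: "without guid"},
	}
	for _, it := range items {
		if g, ok := guidString(it); ok {
			fmt.Printf("%s: guid=%s\n", it.Title, g)
		} else {
			fmt.Printf("%s: no guid\n", it.Title)
		}
	}
}
```

The pointer makes "no guid" distinguishable from an empty guid string, at the
cost of a nil check wherever the field is read.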
Since PR jteeuwen/go-pkg-xmlx#16, SelectNodes no longer performs any hidden
recursion. An RSS document also has a root document node above its channels.
Together, these two facts break the RSS parsing. Fix by first selecting the
root document node, and then selecting the channels for parsing.
Update the RSS/Atom parsing to follow recent changes in how xmlx handles
values. Instead of using the Value property of an xmlx Node,
use the new GetValue function offered in PR jteeuwen/go-pkg-xmlx#15.
The haveItem check for RSS and Atom causes feeds to behave unexpectedly.
For RSS, the checked tags don't necessarily have to be unique. For Atom,
a feed is allowed to contain duplicate items (including duplicated ids),
so these shouldn't be blocked either.
In RSS feeds, the author of an item was always overwritten by a non-standard
creator tag. Change this so that creator is only used when it actually
appears. Otherwise use the previous value of Author, whatever that is.
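The intended fallback can be sketched as below. Representing an item's child
tags as a map and the function name resolveAuthor are assumptions for this
example; the real parser works on XML nodes rather than a map.

```go
package main

import "fmt"

// resolveAuthor returns the creator value only when a "creator" tag is
// actually present in tags; otherwise it keeps the existing author value.
// tags is a hypothetical flattened view of an item's child elements.
func resolveAuthor(author string, tags map[string]string) string {
	if creator, ok := tags["creator"]; ok {
		return creator
	}
	return author
}

func main() {
	withCreator := map[string]string{"creator": "Jane Doe"}
	withoutCreator := map[string]string{}

	fmt.Println(resolveAuthor("editor@example.com", withCreator))
	fmt.Println(resolveAuthor("editor@example.com", withoutCreator))
}
```

The presence check is the important part: reading a missing tag as an empty
string and assigning it unconditionally is what would wipe out Author.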