Sunday, March 11, 2012

RSS News Headlines Prompter update.


I made a few improvements to my news headlines prompter appliance, so I thought I'd share in case anybody cares (or not, I have nothing better to do at the moment :-)).
The improvements were mainly done in the scripts that download the news and drive the LED matrix message board device, which I modified to run from crontab. DSL has a simplified scheduler, implemented as a single-threaded Perl script (MyCron). That means that if your script is going to run longer than just a few seconds or minutes, you need to take into account that it will block other scheduled jobs from running. You can also put the scheduled job/script in the background, thus unblocking the MyCron script to parse and run other jobs; just make your script/program aware of other instances that may have already been started by cron.
Activation of the cron server on DSL is simple. I just added this line to my /opt/bootlocal.sh file:
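To illustrate the background approach, here is a minimal sketch of a helper that starts a job detached from the scheduler and refuses to start it again while the previous run is still going. The name run_detached and the lock directory are my own inventions for this example; nothing like it ships with DSL or MyCron. It relies on mkdir either creating the directory or failing in one step.

```shell
#!/bin/bash
# run_detached LOCKDIR COMMAND [ARGS...]
# Hypothetical helper, not part of DSL or MyCron.
# Starts COMMAND in the background unless LOCKDIR already exists.
run_detached() {
    local lockdir=$1
    shift
    # mkdir creates the directory or fails, in a single step, so two
    # callers cannot both believe they hold the lock.
    if ! mkdir "$lockdir" 2>/dev/null; then
        return 1    # a previous instance is still running
    fi
    # Run the job in a background subshell and drop the lock when it ends.
    ( "$@"; rmdir "$lockdir" ) >/dev/null 2>&1 &
    return 0
}
```

A small script started from crontab could then call something like run_detached /tmp/getnews.lock /home/dsl/bin/getnews! (the lock path is made up) and return to the scheduler right away.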

/usr/bin/perl -w /usr/local/bin/MyCron &

and commented out the startup of my previous scripts, which are now activated from the crontab file:

#bash ldmtrxmsgbrd! >/dev/null 2>&1 &
#bash getnews! >/dev/null 2>&1

Also, because my DSL does not have the DST patch (Busch's legacy), I added time synchronization to /opt/bootlocal.sh, so that the date/time function of the script is accurate:

sudo /usr/local/bin/gettime.lua nist.expertsmi.com 30

File /opt/crontab now looks like this:

* * * * * * echo Cron timestamp `date` >> /tmp/crontest
* * * * * * /home/dsl/bin/ldmtrxmsgbrd! > /tmp/ldmtrxmsgbrd.trc 2>&1
0,10,20,30,40,50 * * * * * /home/dsl/bin/getnews! > /tmp/getnews.trc 2>&1

The ldmtrxmsgbrd! script (the LED matrix device driver) is triggered every minute. If no files are present in the dedicated temporary directory, it displays the date and time on the LED matrix device. If there are files, they are uploaded to the device one at a time (one file per run): the script uploads the first file, removes it from the temporary directory and then goes to sleep for a calculated amount of time. No further instances are triggered by the cron server (since it is single-threaded) until the current instance ends its sleep. However, the ldmtrxmsgbrd! script is also capable of detecting an already running instance of itself and quits if one is detected, so it should work just as well with a regular (multi-threaded) cron scheduler.
The getnews! script runs every 10 minutes; however, it will only download new data when no files are present in the dedicated temporary directory. The directory is empty once the ldmtrxmsgbrd! script has processed all data files.
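One thing to keep in mind with this kind of instance detection: the ldmtrxmsgbrd! script marks itself as running by writing its PID to /tmp/ldmtrxmsgbrd_sem.tmp, and if it is ever killed in the middle of its sleep, that file stays behind and every later run will think another instance is active. Below is a rough sketch of how a leftover semaphore could be recognized. The function name sem_is_stale is hypothetical and is not used by the scripts in this post.

```shell
#!/bin/bash
# sem_is_stale FILE
# Hypothetical helper. Returns 0 (true) when FILE exists but the PID
# stored in it no longer belongs to a running process.
sem_is_stale() {
    local sem=$1 pid
    [ -f "$sem" ] || return 1          # no semaphore at all: not stale
    pid=$(cat "$sem" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*) return 0 ;;       # empty or garbage content: stale
    esac
    # kill -0 sends no signal, it only checks that the process exists.
    # Caveat: it also fails for a process owned by another user.
    if kill -0 "$pid" 2>/dev/null; then
        return 1                       # owner is alive: lock is valid
    fi
    return 0
}
```

A script could call this before its "is the semaphore file there" test and remove the file when it turns out to be stale. PIDs can be reused, so this is a best-effort check rather than a guarantee.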

Improvements to the getnews! script include downloading the Atlantic Tropical Weather Outlook RSS from NOAA, as well as the current conditions and the 7-day forecast for my area. The getnews! script now looks like this:

#!/bin/bash
# Download RSS feed from BBC - crontab version.
# Parse headlines out to plain text file, split it up
# to 30 lines per file and feed it to the ldmtrxmsgbrd! script
# (meaning, store in predefined location /tmp/ldmtrx)

cd ~dsl/bin

mkdir -p /tmp/ldmtrx
cd /tmp
P=`ls /tmp/ldmtrx/ldmtrxnews* 2>/dev/null|head -n1`
if [[ "tt$P" = "tt" ]]
then
        wget -O rss.xml http://en-us.fxfeeds.mozilla.com/en-US/firefox/headlines.xml
        ~dsl/bin/rssparse! rss.xml > headlines_$$.tmp
        cd /tmp/ldmtrx
        split --lines=30 /tmp/headlines_$$.tmp ldmtrxnews
        cd /tmp
        rm headlines_$$.tmp rss.xml

        wget -O rss.xml http://www.nhc.noaa.gov/index-at.xml
        ~dsl/bin/rssparse! rss.xml > headlines_$$.tmp
        cd /tmp/ldmtrx
        split --lines=30 /tmp/headlines_$$.tmp ldmtrxnews2
        cd /tmp
        rm headlines_$$.tmp rss.xml

        wget -O weather.html "http://mobile.weather.gov/port_mp_ns.php?select=3&CityName=Palm%20Harbor&site=TBW&State=FL&warnzone=FLZ050"
        ~dsl/bin/htmlparse! weather.html > headlines_$$.tmp
        cd /tmp/ldmtrx
        split --lines=30 /tmp/headlines_$$.tmp ldmtrxnews3
        cd /tmp
        rm headlines_$$.tmp weather.html

        wget -O weather.html "http://mobile.weather.gov/port_mp_ns.php?select=1&CityName=Palm%20Harbor&site=TBW&State=FL&warnzone=FLZ050"
        ~dsl/bin/htmlparse! weather.html > headlines_$$.tmp
        cd /tmp/ldmtrx
        split --lines=30 /tmp/headlines_$$.tmp ldmtrxnews4
        cd /tmp
        rm headlines_$$.tmp weather.html
fi

The ldmtrxmsgbrd! script that drives the LED matrix device received improvements as well:

#!/bin/bash
# Crontab version.
# Monitor /tmp/ldmtrx directory for files by ldmtrxnews* pattern.
# If none, send date and time to Led Matrix Scrolling Board and exit.
# If files found, upload texts from the 1st file to the Board in MAUTO mode.
# Remove file.
# Sleep for 30 seconds per uploaded line.

function ShowDate()
{
   DS=`date | awk '{print substr($0,1,16);}'`
   ~dsl/bin/ldmtrxcmd! scrloff ttyS0
   ~dsl/bin/ldmtrxcmd! mext ttyS0
   ~dsl/bin/ldmtrxcmd! stext ttyS0
   ~dsl/bin/ldmtrxcmd! "$DS" ttyS0
}

cd ~dsl/bin
if [[ ! -f /tmp/ldmtrxmsgbrd_sem.tmp ]]
then
        echo "$$" > /tmp/ldmtrxmsgbrd_sem.tmp
        ~dsl/bin/ldmtrxendscript! ttyS0
        if [[ ! -f /tmp/ldmtrxbrd_undetected.tmp ]]
        then
                F=`ls /tmp/ldmtrx/ldmtrxnews* 2>/dev/null|head -n1`
                if [[ -s $F ]]
                then
                        ~dsl/bin/ldmtrxcmd! mauto ttyS0
                        ~dsl/bin/ldmtrxcmd! upl ttyS0
                        ~dsl/bin/loader! $F ttyS0
                        ~dsl/bin/ldmtrxcmd! @EOT ttyS0
                        ~dsl/bin/ldmtrxcmd! scrlon ttyS0
                        ~dsl/bin/ldmtrxcmd! del30 ttyS0
                        LINES=`cat $F|wc -l 2>/dev/null`
                        let SLEEP=30*LINES
                        echo "LINES=$LINES, SLEEP=$SLEEP"
                        rm $F
                        sleep $SLEEP
                else
                        if [[ -f $F ]]
                        then
                                rm $F
                        fi
                        ShowDate
                fi
        else
                echo "Led Matrix Scrolling Message Board device is not present."
                echo "Exiting."
        fi
        rm /tmp/ldmtrxmsgbrd_sem.tmp
else
        echo "Another instance is running."
fi

Note that the time the script sleeps while messages are being displayed/scrolled on the LED matrix device is now calculated as the number of lines the file actually contains multiplied by 30 seconds, instead of being fixed at 900 seconds. There is still room for improvement here, since there may be long lines in the text (especially in weather forecasts) that need more than 30 seconds to scroll fully across. This will be corrected in the next version.
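One possible way to take line length into account is sketched below. It assumes the text scrolls at a roughly constant number of characters per second; the default of 5 is a placeholder I made up, not a measured value for this device, and calc_sleep is a hypothetical function name. The current 30 seconds per line is kept as a minimum.

```shell
#!/bin/bash
# calc_sleep FILE [CHARS_PER_SEC] [MIN_SEC_PER_LINE]
# Hypothetical helper. Prints the total number of seconds to sleep:
# each line gets the time needed to scroll its characters (rounded up),
# but never less than MIN_SEC_PER_LINE.
calc_sleep() {
    local file=$1 cps=${2:-5} min=${3:-30}
    awk -v cps="$cps" -v min="$min" '
        {
            t = int((length($0) + cps - 1) / cps)   # round up
            if (t < min) t = min
            total += t
        }
        END { printf "%d\n", total }' "$file"
}
```

In the driver script the value would have to be computed before the file is removed, for example SLEEP=$(calc_sleep "$F") in place of the line count multiplication. A 300-character forecast line at 5 characters per second would then get 60 seconds instead of 30.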

Cosmetic changes were made to rssparse!, and a new parsing script, htmlparse!, was added:

Contents of rssparse! script:

#!/bin/bash

cat $1 | awk '

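# Runs for every input line: reset the scan state for this line.
{
   pos=1;
   xml=$0
   len=length(xml);
   endp=1
}

# Print the text of every <title>...</title> element on the line.
{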
   while(pos <= len)
   {
      if(substr(xml,pos,7) == "<title>")
      {
         pos=pos+7;
         endp=pos;
         while((substr(xml,endp,8) != "</title>") && (endp < len))
         {
            endp++;
         }
         print "   ",substr(xml,pos,endp-pos)," * ";
         pos=endp+7;
      }
      pos++;
   }
}'
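For quick testing without the device attached, the same title extraction can be tried on a one-line sample feed. The variant below is only a sketch: it uses awk's index() function instead of scanning character by character, and extract_titles is a name I made up. Unlike rssparse!, it skips a title whose closing tag is missing on the line.

```shell
#!/bin/bash
# extract_titles [FILE...]
# Hypothetical alternative to rssparse!, reading FILEs or standard input.
# Prints the text of each <title>...</title> pair found on a line,
# padded the same way rssparse! pads its output.
extract_titles() {
    awk '{
        rest = $0
        while ((s = index(rest, "<title>")) > 0) {
            rest = substr(rest, s + 7)          # text after <title>
            e = index(rest, "</title>")
            if (e == 0) break                   # unterminated title: skip
            print "   ", substr(rest, 1, e - 1), " * "
            rest = substr(rest, e + 8)          # continue after </title>
        }
    }' "$@"
}
```

Feeding it a line such as &lt;title&gt;A&lt;/title&gt; through a pipe (written here with the real angle brackets, of course) should print the title surrounded by the same spaces and trailing asterisk that the message board files use.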

Contents of htmlparse! script:

#!/bin/bash

cat $1 | awk '

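# Runs for every input line: reset the scan state for this line.
{
   pos=1;
   xml=$0
   len=length(xml);
   str=""
}

# Strip the known tags; emit the collected text at each <hr> or <br>.
{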
   while(pos <= len)
   {
      if((pos <= len - 5) && (substr(xml,pos,6) == "<html>"))
      {
         pos=pos+5;
      }
      else if((pos <= len - 4) && (substr(xml,pos,5) == "<meta"))
      {
         pos=pos+5;
         while((pos <= len) && (substr(xml,pos,1) != ">"))
         {
            pos++;
         } 
      }
      else if((pos <= len - 5) && (substr(xml,pos,6) == "<body>"))
      {
         pos=pos+5;
      }
      else if((pos <= len - 2) && (substr(xml,pos,3) == "<b>"))
      {
         pos=pos+2;
      }
      else if((pos <= len - 3) && (substr(xml,pos,4) == "</b>"))
      {
         pos=pos+3;
      }
      else if((pos <= len - 3) && (substr(xml,pos,4) == "<div"))
      {
         pos=pos+4;
         while((pos <= len) && (substr(xml,pos,1) != ">"))
         {
            pos++;
         }
      }
      else if((pos <= len - 3) && (substr(xml,pos,4) == "<hr>"))
      {
         pos=pos+3;
         if(length(str) > 0)
         {
            str=str " * ";
            print str;
            str=""
         }
      }
      else if((pos <= len - 3) && (substr(xml,pos,4) == "<br>"))
      {
         pos=pos+3;
         if(length(str) > 0)
         {
            str=str " * ";
            print str;
            str=""
         }
      }
      else if((pos <= len - 5) && (substr(xml,pos,6) == "<form>"))
      {
         pos=pos+6;
         while((substr(xml,pos,7) != "</form>") && (pos <= len))
         {
            pos++;
         }
         pos=pos+6;
      }
      else if((pos <= len - 5) && substr(xml,pos,6) == "</div>")
      {
         pos=pos+5;
      }
      else if((pos <= len - 6) && substr(xml,pos,7) == "</body>")
      {
         pos=pos+6;
      }
      else if((pos <= len - 6) && substr(xml,pos,7) == "</html>")
      {
         pos=pos+6;
      }
      else
      {
         str=str substr(xml,pos,1);
      }
      pos++;
   }
}
'

When shutting down the system, we need to make sure the cron server is no longer running before we switch the LED matrix device into standalone clock mode. Otherwise the lengthy backup process may last long enough for new instances of the getnews! and ldmtrxmsgbrd! scripts to be triggered from crontab and alter the device mode. The new /opt/powerdown.sh script now looks like this:

#!/bin/sh
# Put system command to perform upon system shutdown
# Kill cron server
# Set LED Matrix Scrolling Message Board to the clock mode.
bash ~dsl/bin/pwdwn!
# automate system backups
cleanMyDSL.sh
if [ -s /opt/.backup_device -a ! -f /opt/.skip_backup ]; then filetool.sh backup noprompt; fi

Note that the ability to bypass the backup step during shutdown was added: you create the /opt/.skip_backup file manually, and the powerdown.sh script detects it. That file does not have to be removed explicitly because the persistent setup does not include it, so it will be gone after the next boot. Just make sure you have all changes backed up before using this mechanism, by running the backup tool manually if changes were made to the system setup or vital scripts. It'd actually be a good idea to schedule the backup in crontab, only you need to make sure that any backup process that has started is allowed to finish and is not duplicated by another one running at the same time (create your own wrapper script). Bypassing the backup process at power down greatly speeds up the shutdown/reboot in a DSL system configured like mine (boot from flash drive, running in RAM, persistence).

Contents of pwdwn! script:

#!/bin/bash
# Kill cron server.
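# The [M] bracket keeps grep from matching its own command line;
# "exit" in awk keeps only the first matching PID.
CPID=`ps|grep '[M]yCron'|awk '{print $1; exit}' 2>/dev/null`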
if [[ $CPID -gt 1 ]]
then
        sudo kill $CPID
fi
# Set LED Matrix Scrolling Message Board to the clock mode.
bash ~dsl/bin/inittty! /dev/ttyS0
bash ~dsl/bin/ldmtrxsettime! ttyS0

I did not realize when I started this project that it'd be so much fun. Perfecting an automated system so that it can run unattended and do its job in a reliable and efficient manner is exactly what makes any engineer's job exciting.

Thanks for reading.

Marek

PS:

And now traditionally, some pictures:

Pic. 1: The weather news is not great, but not bad either :-)


Pic. 2: It only takes this many processes to run DSL as a platform for my news headlines prompter and a few extras.

Pic. 3: All runs from RAM.
Pic. 4: Sweet little micro ITX mobo, MSI fuzzy CX700D.

Pic. 5: No news is good news. And I know now that it is late... :-)