{\rtf1\ansi\ansicpg1252\cocoartf2578 \cocoatextscaling0\cocoaplatform0{\fonttbl\f0\fswiss\fcharset0 Helvetica-Bold;\f1\fswiss\fcharset0 Helvetica;} {\colortbl;\red255\green255\blue255;} {\*\expandedcolortbl;;} \margl1440\margr1440\vieww12820\viewh10560\viewkind0 \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\qc\partightenfactor0 \f0\b\fs28 \cf0 About "Validate External Links" \f1\b0 \ developed by Iritscen ({\field{\*\fldinst{HYPERLINK "http://iritscen.oni2.net"}}{\fldrslt http://iritscen.oni2.net}})\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0 \cf0 \ \ul Introduction\ulnone \ Validate External Links ("ValExtLinks" for short) is a Bash shell script for validating large numbers of external links on a wiki. It was developed on a Mac, but hopefully any Bash shell from v3.2 onward can run it. Non-Bash shells won't work, as there is a lot of Bash-specific syntax in this script. The script invokes Unix binaries that are mostly pretty standard, but you might want to make sure that you have "curl" and "expect" installed.\ \ The purpose of this read-me is not to tell you how to use the script. Running the script with the --help option should give you that information. 
Instead, this read-me points out the items that you will need to adapt to your system before you can use ValExtLinks, and describes the files provided for developing and testing it.\ \ \ul Execution\ulnone \ The following files are in the \f0\b main directory\f1\b0 , in reverse-alphabetical order:\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li956\fi-957\pardirnatural\partightenfactor0 \cf0 \'95 \f0\b validate_external_links.sh \f1\b0 is the main script.\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1662\fi-1664\pardirnatural\partightenfactor0 \cf0 \'95 The AGENT variable should contain a reasonably up-to-date user agent string, preferably generated from Google Chrome since that is the browser used to take screenshots of pages. Various web sites will show you your browser's user agent string, or you can simply upload and then visit the supplied print_user_agent.php (see the Development folder).\ \'95 WIKI_CURL and WIKI_HTTP need to be set to the locations of the pages that contain "curl" error codes and HTTP response codes, respectively. See the Documentation folder for copies of these pages.\ \'95 WIKI_MAIN should be set to the location of the main documentation.\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0 \cf0 \'95 \f0\b validate_external_links.command \f1\b0 is intended to be the means of running ValExtLinks. The idea is to make sure this file is executable and then to double-click it (or have "cron" invoke it) when you want to run ValExtLinks.\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li1662\fi-1664\pardirnatural\partightenfactor0 \cf0 \'95 You can use the _LOCAL variables to supply the exact links and exception files that you want the script to process. Sample alternate invocations of ValExtLinks are also provided. 
Currently, the extlinks.csv file hosted on oni2.net is refreshed twice a day, at 06:20 and 14:20 GMT.\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li956\fi-957\pardirnatural\partightenfactor0 \cf0 \'95 \f0\b val_expect_sftp.txt \f1\b0 is an "expect" script that performs the actual upload of the ValExtLinks report. ValExtLinks was written to use SFTP because oni2.net does not support regular FTP.\ \'95 \f0\b sftp_login.txt \f1\b0 should be populated with your SFTP login info and the path that you want the report uploaded to. The ValExtLinks script reads this file when it runs val_expect_sftp.txt to invoke "sftp".\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0 \cf0 \ For your reference, the \f0\b Sample files \f1\b0 folder contains:\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li937\fi-938\pardirnatural\partightenfactor0 \cf0 \'95 \f0\b exceptions.txt \f1\b0 is a sample of how the exceptions list should look, formatted as a MediaWiki page. 
Anything before "BEGIN LIST" or after "END LIST" is ignored, so a local plain-text file that contains only those keywords and a list of exceptions between them should work just as well.\ \'95 \f0\b extlinks.csv \f1\b0 is a sample of oni2.net's external link table dump.\ \'95 \f0\b ValExtLinks report.htm/.rtf/.txt \f1\b0 are samples of the output you should get from ValExtLinks.\ \ The \f0\b Documentation \f1\b0 folder contains:\ \'95 \f0\b curl codes.txt \f1\b0 lists the possible error codes that "curl" can return, formatted for MediaWiki.\ \'95 \f0\b HTTP codes.txt \f1\b0 lists the HTTP response codes that ValExtLinks understands, formatted for MediaWiki.\ \'95 \f0\b License.txt \f1\b0 is a standard copy of the MIT license, which applies to the whole project.\ \'95 \f0\b Read Me.rtf \f1\b0 is this file!\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0 \cf0 \ \ul Development\ulnone \ For testing and development, the \f0\b Development \f1\b0 folder contains:\ \'95 \f0\b Get script line count.command \f1\b0 tells you how big the script is.\ \'95 \f0\b print_user_agent.php \f1\b0 can be used for setting the AGENT variable, as mentioned above.\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\li950\fi-951\pardirnatural\partightenfactor0 \cf0 \'95 \f0\b Sample Archive response.txt \f1\b0 is a sample of the output of the Internet Archive's snapshot availability API.\ \'95 \f0\b Sample header - OK.txt \f1\b0 is a sample of what "curl" sees when it is run with the --head option and gets an OK (200) response.\ \'95 \f0\b Sample header - redirect.txt \f1\b0 is a sample of what "curl" sees when it is run with the --head option and gets a redirection (302) response.\ \'95 \f0\b ValExtLinks to-do.rtf \f1\b0 is the development to-do list.\ \'95 \f0\b YouTube - video_*.txt \f1\b0 are files with sample page source from videos that are no good, used to teach ValExtLinks how to recognize bad YouTube links.\ 
\'95 \f0\b YouTube bad link detection.rtf \f1\b0 contains the links for the videos that the page source samples are from.\ \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0 \cf0 \ All right, that's all. Have fun fixing those external links!}