1: #!/usr/bin/perl
2: # The LearningOnline Network with CAPA
3: # lonc maintains the connections to remote computers
4: #
5: # $Id: loncnew,v 1.26 2003/09/30 11:11:17 foxr Exp $
6: #
7: # Copyright Michigan State University Board of Trustees
8: #
9: # This file is part of the LearningOnline Network with CAPA (LON-CAPA).
10: ## LON-CAPA is free software; you can redistribute it and/or modify
11: # it under the terms of the GNU General Public License as published by
12: # the Free Software Foundation; either version 2 of the License, or
13: # (at your option) any later version.
14: #
15: # LON-CAPA is distributed in the hope that it will be useful,
16: # but WITHOUT ANY WARRANTY; without even the implied warranty of
17: # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18: # GNU General Public License for more details.
19: #
20: # You should have received a copy of the GNU General Public License
21: # along with LON-CAPA; if not, write to the Free Software
22: # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
23: #
24: # /home/httpd/html/adm/gpl.txt
25: #
26: # http://www.lon-capa.org/
27: #
28: #
29: # new lonc handles n request out bver m connections to londs.
30: # This module is based on the Event class.
31: # Development iterations:
32: # - Setup basic event loop. (done)
33: # - Add timer dispatch. (done)
34: # - Add ability to accept lonc UNIX domain sockets. (done)
35: # - Add ability to create/negotiate lond connections (done).
36: # - Add general logic for dispatching requests and timeouts. (done).
37: # - Add support for the lonc/lond requests. (done).
38: # - Add logging/status monitoring.
39: # - Add Signal handling - HUP restarts. USR1 status report.
40: # - Add Configuration file I/O (done).
41: # - Add management/status request interface.
42: # - Add deferred request capability. (done)
43: # - Detect transmission timeouts.
44: #
45:
46: # Change log:
47: # $Log: loncnew,v $
48: # Revision 1.26 2003/09/30 11:11:17 foxr
49: # Add book-keeping hashes to support the re-init procedure.
50: #
51: # Revision 1.25 2003/09/23 11:22:14 foxr
52: # Tested ability to receive sigusr2 This is now logged and must be
53: # properly implemented as a re-read of hosts and re-init of appropriate
54: # children.
55: #
56: # Revision 1.24 2003/09/16 09:46:42 foxr
57: # Added skeletal infrastructure to support SIGUSR2 update hosts request.
58: #
59: # Revision 1.23 2003/09/15 09:24:49 foxr
60: # Add use strict and fix all the fallout from that.
61: #
62: # Revision 1.22 2003/09/02 10:34:47 foxr
63: # - Fix errors in host dead detection logic (too many cases where the
64: # retries left were not getting incremented or just not checked).
65: # - Added some additional status to the ps axuww display:
66: # o Remaining retries on a host.
67: # o >>> DEAD <<< indicator if I've given up on a host.
68: # - Tested the SIGHUP will reset the retries remaining count (thanks to
69: # the above status stuff, and get allow the loncnew to re-try again
70: # on the host (thanks to the log).
71: #
72: # Revision 1.21 2003/08/26 09:19:51 foxr
73: # How embarrassing... put in the SocketTimeout function in loncnew and forgot
74: # to actually hook it into the LondTransaction. Added this to MakeLondConnection
75: # where it belongs... hopefully transactions (not just connection attempts) will
76: # timeout more speedily than the socket errors will catch it.
77: #
78: # Revision 1.20 2003/08/25 18:48:11 albertel
79: # - fixing a forgotten ;
80: #
81: # Revision 1.19 2003/08/19 09:31:46 foxr
82: # Get socket directory from configuration rather than the old hard coded test
83: # way that I forgot to un-hard code.
84: #
85: # Revision 1.18 2003/08/06 09:52:29 foxr
86: # Also needed to remember to fail in-flight transactions if their sends fail.
87: #
88: # Revision 1.17 2003/08/03 00:44:31 foxr
89: # 1. Correct handling of connection failure: Assume it means the host is
90: # unreachable and fail all of the queued transactions. Note that the
91: # inflight transactions should fail on their own time due either to timeout
92: # or send/receive failures.
93: # 2. Correct handling of logs for forced death signals. Pull the signal
94: # from the event watcher.
95: #
96: # Revision 1.16 2003/07/29 02:33:05 foxr
97: # Add SIGINT processing to child processes to toggle annoying trace mode
98: # on/off.. will try to use this to isolate the compute boud process issue.
99: #
100: # Revision 1.15 2003/07/15 02:07:05 foxr
101: # Added code for lonc/lond transaction timeouts. Who knows if it works right.
102: # The intent is for a timeout to fail any transaction in progress and kill
103: # off the sockt that timed out.
104: #
105: # Revision 1.14 2003/07/03 02:10:18 foxr
106: # Get all of the signals to work correctly.
107: #
108: # Revision 1.13 2003/07/02 01:31:55 foxr
109: # Added kill -HUP logic (restart).
110: #
111: # Revision 1.11 2003/06/25 01:54:44 foxr
112: # Fix more problems with transaction failure.
113: #
114: # Revision 1.10 2003/06/24 02:46:04 foxr
115: # Put a limit on the number of times we'll retry a connection.
116: # Start getting the signal stuff put in as well...note that need to get signals
117: # going or else the client will permanently give up on dead servers.
118: #
119: # Revision 1.9 2003/06/13 02:38:43 foxr
120: # Add logging in 'expected format'
121: #
122: # Revision 1.8 2003/06/11 02:04:35 foxr
123: # Support delayed transactions... this is done uniformly by encapsulating
124: # transactions in an object ... a LondTransaction that is implemented by
125: # LondTransaction.pm
126: #
127: # Revision 1.7 2003/06/03 01:59:39 foxr
128: # complete coding to support deferred transactions.
129: #
130: #
131: use strict;
132: use lib "/home/httpd/lib/perl/";
133: use lib "/home/foxr/newloncapa/types";
134: use Event qw(:DEFAULT );
135: use POSIX qw(:signal_h);
136: use POSIX;
137: use IO::Socket;
138: use IO::Socket::INET;
139: use IO::Socket::UNIX;
140: use IO::File;
141: use IO::Handle;
142: use Socket;
143: use Crypt::IDEA;
144: use LONCAPA::Queue;
145: use LONCAPA::Stack;
146: use LONCAPA::LondConnection;
147: use LONCAPA::LondTransaction;
148: use LONCAPA::Configuration;
149: use LONCAPA::HashIterator;
150:
151:
152: #
153: # Disable all signals we might receive from outside for now.
154: #
155:
156:
157: # Read the httpd configuration file to get perl variables
158: # normally set in apache modules:
159:
160: my $perlvarref = LONCAPA::Configuration::read_conf('loncapa.conf');
161: my %perlvar = %{$perlvarref};
162:
163: #
164: # parent and shared variables.
165:
166: my %ChildHash; # by pid -> host.
167: my %HostToPid; # By host -> pid.
168: my %HostHash; # by loncapaname -> IP.
169:
170:
171: my $MaxConnectionCount = 10; # Will get from config later.
172: my $ClientConnection = 0; # Uniquifier for client events.
173:
174: my $DebugLevel = 0;
175: my $NextDebugLevel= 10; # So Sigint can toggle this.
176: my $IdleTimeout= 3600; # Wait an hour before pruning connections.
177:
178: #
179: # The variables below are only used by the child processes.
180: #
181: my $RemoteHost; # Name of host child is talking to.
182: my $UnixSocketDir= $perlvar{'lonSockDir'};
183: my $IdleConnections = Stack->new(); # Set of idle connections
184: my %ActiveConnections; # Connections to the remote lond.
185: my %ActiveTransactions; # LondTransactions in flight.
186: my %ActiveClients; # Serial numbers of active clients by socket.
187: my $WorkQueue = Queue->new(); # Queue of pending transactions.
188: my $ConnectionCount = 0;
189: my $IdleSeconds = 0; # Number of seconds idle.
190: my $Status = ""; # Current status string.
191: my $RecentLogEntry = "";
192: my $ConnectionRetries=5; # Number of connection retries allowed.
193: my $ConnectionRetriesLeft=5; # Number of connection retries remaining.
194:
195: #
196: # The hash below gives the HTML format for log messages
197: # given a severity.
198: #
199: my %LogFormats;
200:
201: $LogFormats{"CRITICAL"} = "<font color=red>CRITICAL: %s</font>";
202: $LogFormats{"SUCCESS"} = "<font color=green>SUCCESS: %s</font>";
203: $LogFormats{"INFO"} = "<font color=yellow>INFO: %s</font>";
204: $LogFormats{"WARNING"} = "<font color=blue>WARNING: %s</font>";
205: $LogFormats{"DEFAULT"} = " %s ";
206:
207:
208:
209: =pod
210:
211: =head2 LogPerm
212:
213: Makes an entry into the permanent log file.
214:
215: =cut
216: sub LogPerm {
217: my $message=shift;
218: my $execdir=$perlvar{'lonDaemons'};
219: my $now=time;
220: my $local=localtime($now);
221: my $fh=IO::File->new(">>$execdir/logs/lonnet.perm.log");
222: print $fh "$now:$message:$local\n";
223: }
224:
225: =pod
226:
227: =head2 Log
228:
229: Logs a message to the log file.
230: Parameters:
231:
232: =item severity
233:
234: One of CRITICAL, WARNING, INFO, SUCCESS used to select the
235: format string used to format the message. if the severity is
236: not a defined severity the Default format string is used.
237:
238: =item message
239:
240: The base message. In addtion to the format string, the message
241: will be appended to a string containing the name of our remote
242: host and the time will be formatted into the message.
243:
244: =cut
245:
246: sub Log {
247: my $severity = shift;
248: my $message = shift;
249:
250: if(!$LogFormats{$severity}) {
251: $severity = "DEFAULT";
252: }
253:
254: my $format = $LogFormats{$severity};
255:
256: # Put the window dressing in in front of the message format:
257:
258: my $now = time;
259: my $local = localtime($now);
260: my $finalformat = "$local ($$) [$RemoteHost] [$Status] ";
261: my $finalformat = $finalformat.$format."\n";
262:
263: # open the file and put the result.
264:
265: my $execdir = $perlvar{'lonDaemons'};
266: my $fh = IO::File->new(">>$execdir/logs/lonc.log");
267: my $msg = sprintf($finalformat, $message);
268: $RecentLogEntry = $msg;
269: print $fh $msg;
270:
271:
272: }
273:
274:
275: =pod
276:
277: =head2 GetPeerName
278:
279: Returns the name of the host that a socket object is connected to.
280:
281: =cut
282:
283: sub GetPeername {
284: my $connection = shift;
285: my $AdrFamily = shift;
286: my $peer = $connection->peername();
287: my $peerport;
288: my $peerip;
289: if($AdrFamily == AF_INET) {
290: ($peerport, $peerip) = sockaddr_in($peer);
291: my $peername = gethostbyaddr($peerip, $AdrFamily);
292: return $peername;
293: } elsif ($AdrFamily == AF_UNIX) {
294: my $peerfile;
295: ($peerfile) = sockaddr_un($peer);
296: return $peerfile;
297: }
298: }
299: #----------------------------- Timer management ------------------------
300: =pod
301:
302: =head2 Debug
303:
304: Invoked to issue a debug message.
305:
306: =cut
307:
308: sub Debug {
309: my $level = shift;
310: my $message = shift;
311: if ($level <= $DebugLevel) {
312: Log("INFO", "-Debug- $message host = $RemoteHost");
313: }
314: }
315:
316: sub SocketDump {
317: my $level = shift;
318: my $socket= shift;
319: if($level <= $DebugLevel) {
320: $socket->Dump();
321: }
322: }
323:
324: =pod
325:
326: =head2 ShowStatus
327:
328: Place some text as our pid status.
329: and as what we return in a SIGUSR1
330:
331: =cut
332: sub ShowStatus {
333: my $state = shift;
334: my $now = time;
335: my $local = localtime($now);
336: $Status = $local.": ".$state;
337: $0='lonc: '.$state.' '.$local;
338: }
339:
340: =pod
341:
342: =head 2 SocketTimeout
343:
344: Called when an action on the socket times out. The socket is
345: destroyed and any active transaction is failed.
346:
347:
348: =cut
349: sub SocketTimeout {
350: my $Socket = shift;
351:
352: KillSocket($Socket); # A transaction timeout also counts as
353: # a connection failure:
354: $ConnectionRetriesLeft--;
355: }
356:
357: =pod
358:
359: =head2 Tick
360:
361: Invoked each timer tick.
362:
363: =cut
364:
365:
366: sub Tick {
367: my $client;
368: if($ConnectionRetriesLeft > 0) {
369: ShowStatus(GetServerHost()." Connection count: ".$ConnectionCount
370: ." Retries remaining: ".$ConnectionRetriesLeft);
371: } else {
372: ShowStatus(GetServerHost()." >> DEAD <<");
373: }
374: # Is it time to prune connection count:
375:
376:
377: if($IdleConnections->Count() &&
378: ($WorkQueue->Count() == 0)) { # Idle connections and nothing to do?
379: $IdleSeconds++;
380: if($IdleSeconds > $IdleTimeout) { # Prune a connection...
381: my $Socket = $IdleConnections->pop();
382: KillSocket($Socket);
383: }
384: } else {
385: $IdleSeconds = 0; # Reset idle count if not idle.
386: }
387: #
388: # For each inflight transaction, tick down its timeout counter.
389: #
390: foreach my $item (keys %ActiveTransactions) {
391: my $Socket = $ActiveTransactions{$item}->getServer();
392: $Socket->Tick();
393: }
394: # Do we have work in the queue, but no connections to service them?
395: # If so, try to make some new connections to get things going again.
396: #
397:
398: my $Requests = $WorkQueue->Count();
399: if (($ConnectionCount == 0) && ($Requests > 0)) {
400: if ($ConnectionRetriesLeft > 0) {
401: my $Connections = ($Requests <= $MaxConnectionCount) ?
402: $Requests : $MaxConnectionCount;
403: Debug(1,"Work but no connections, start ".$Connections." of them");
404: my $successCount = 0;
405: for (my $i =0; $i < $Connections; $i++) {
406: $successCount += MakeLondConnection();
407: }
408: if($successCount == 0) { # All connections failed:
409: Debug(1,"Work in queue failed to make any connectiouns\n");
410: EmptyQueue(); # Fail pending transactions with con_lost.
411: }
412: } else {
413: ShowStatus(GetServerHost()." >>> DEAD!!! <<<");
414: Debug(1,"Work in queue, but gave up on connections..flushing\n");
415: EmptyQueue(); # Connections can't be established.
416: }
417:
418: }
419: }
420:
421: =pod
422:
423: =head2 SetupTimer
424:
425: Sets up a 1 per sec recurring timer event. The event handler is used to:
426:
427: =item
428:
429: Trigger timeouts on communications along active sockets.
430:
431: =item
432:
433: Trigger disconnections of idle sockets.
434:
435: =cut
436:
437: sub SetupTimer {
438: Debug(6, "SetupTimer");
439: Event->timer(interval => 1, debug => 1, cb => \&Tick );
440: }
441:
442: =pod
443:
444: =head2 ServerToIdle
445:
446: This function is called when a connection to the server is
447: ready for more work.
448:
449: If there is work in the Work queue the top element is dequeued
450: and the connection will start to work on it. If the work queue is
451: empty, the connection is pushed on the idle connection stack where
452: it will either get another work unit, or alternatively, if it sits there
453: long enough, it will be shut down and released.
454:
455: =cut
456:
457: sub ServerToIdle {
458: my $Socket = shift; # Get the socket.
459: delete($ActiveTransactions{$Socket}); # Server has no transaction
460:
461: &Debug(6, "Server to idle");
462:
463: # If there's work to do, start the transaction:
464:
465: my $reqdata = $WorkQueue->dequeue(); # This is a LondTransaction
466: unless($reqdata eq undef) {
467: Debug(9, "Queue gave request data: ".$reqdata->getRequest());
468: &StartRequest($Socket, $reqdata);
469:
470: } else {
471:
472: # There's no work waiting, so push the server to idle list.
473: &Debug(8, "No new work requests, server connection going idle");
474: $IdleConnections->push($Socket);
475: }
476: }
477:
478: =pod
479:
480: =head2 ClientWritable
481:
482: Event callback for when a client socket is writable.
483:
484: This callback is established when a transaction reponse is
485: avaiable from lond. The response is forwarded to the unix socket
486: as it becomes writable in this sub.
487:
488: Parameters:
489:
490: =item Event
491:
492: The event that has been triggered. Event->w->data is
493: the data and Event->w->fd is the socket to write.
494:
495: =cut
496:
497: sub ClientWritable {
498: my $Event = shift;
499: my $Watcher = $Event->w;
500: my $Data = $Watcher->data;
501: my $Socket = $Watcher->fd;
502:
503: # Try to send the data:
504:
505: &Debug(6, "ClientWritable writing".$Data);
506: &Debug(9, "Socket is: ".$Socket);
507:
508: if($Socket->connected) {
509: my $result = $Socket->send($Data, 0);
510:
511: # $result undefined: the write failed.
512: # otherwise $result is the number of bytes written.
513: # Remove that preceding string from the data.
514: # If the resulting data is empty, destroy the watcher
515: # and set up a read event handler to accept the next
516: # request.
517:
518: &Debug(9,"Send result is ".$result." Defined: ".defined($result));
519: if(defined($result)) {
520: &Debug(9, "send result was defined");
521: if($result == length($Data)) { # Entire string sent.
522: &Debug(9, "ClientWritable data all written");
523: $Watcher->cancel();
524: #
525: # Set up to read next request from socket:
526:
527: my $descr = sprintf("Connection to lonc client %d",
528: $ActiveClients{$Socket});
529: Event->io(cb => \&ClientRequest,
530: poll => 'r',
531: desc => $descr,
532: data => "",
533: fd => $Socket);
534:
535: } else { # Partial string sent.
536: $Watcher->data(substr($Data, $result));
537: if($result == 0) { # client hung up on us!!
538: Log("INFO", "lonc pipe client hung up on us!");
539: $Watcher->cancel;
540: $Socket->shutdown(2);
541: $Socket->close();
542: }
543: }
544:
545: } else { # Error of some sort...
546:
547: # Some errnos are possible:
548: my $errno = $!;
549: if($errno == POSIX::EWOULDBLOCK ||
550: $errno == POSIX::EAGAIN ||
551: $errno == POSIX::EINTR) {
552: # No action taken?
553: } else { # Unanticipated errno.
554: &Debug(5,"ClientWritable error or peer shutdown".$RemoteHost);
555: $Watcher->cancel; # Stop the watcher.
556: $Socket->shutdown(2); # Kill connection
557: $Socket->close(); # Close the socket.
558: }
559:
560: }
561: } else {
562: $Watcher->cancel(); # A delayed request...just cancel.
563: }
564: }
565:
566: =pod
567:
568: =head2 CompleteTransaction
569:
570: Called when the reply data has been received for a lond
571: transaction. The reply data must now be sent to the
572: ultimate client on the other end of the Unix socket. This is
573: done by setting up a writable event for the socket with the
574: data the reply data.
575:
576: Parameters:
577:
578: =item Socket
579:
580: Socket on which the lond transaction occured. This is a
581: LondConnection. The data received is in the TransactionReply member.
582:
583: =item Transaction
584:
585: The transaction that is being completed.
586:
587: =cut
588:
589: sub CompleteTransaction {
590: &Debug(6,"Complete transaction");
591: my $Socket = shift;
592: my $Transaction = shift;
593:
594: if (!$Transaction->isDeferred()) { # Normal transaction
595: my $data = $Socket->GetReply(); # Data to send.
596: StartClientReply($Transaction, $data);
597: } else { # Delete deferred transaction file.
598: Log("SUCCESS", "A delayed transaction was completed");
599: LogPerm("S:$Transaction->getClient() :".$Transaction->getRequest());
600: unlink $Transaction->getFile();
601: }
602: }
603: =pod
604: =head1 StartClientReply
605:
606: Initiates a reply to a client where the reply data is a parameter.
607:
608: =head2 parameters:
609:
610: =item Transaction
611:
612: The transaction for which we are responding to the client.
613:
614: =item data
615:
616: The data to send to apached client.
617:
618: =cut
619: sub StartClientReply {
620: my $Transaction = shift;
621: my $data = shift;
622:
623:
624: my $Client = $Transaction->getClient();
625:
626: &Debug(8," Reply was: ".$data);
627: my $Serial = $ActiveClients{$Client};
628: my $desc = sprintf("Connection to lonc client %d",
629:
630: $Serial);
631: Event->io(fd => $Client,
632: poll => "w",
633: desc => $desc,
634: cb => \&ClientWritable,
635: data => $data);
636: }
637: =pod
638: =head2 FailTransaction
639:
640: Finishes a transaction with failure because the associated lond socket
641: disconnected. There are two possibilities:
642: - The transaction is deferred: in which case we just quietly
643: delete the transaction since there is no client connection.
644: - The transaction is 'live' in which case we initiate the sending
645: of "con_lost" to the client.
646:
647: Deleting the transaction means killing it from the
648: %ActiveTransactions hash.
649:
650: Parameters:
651:
652: =item client
653:
654: The LondTransaction we are failing.
655:
656: =cut
657:
658: sub FailTransaction {
659: my $transaction = shift;
660: Log("WARNING", "Failing transaction ".$transaction->getRequest());
661: Debug(1, "Failing transaction: ".$transaction->getRequest());
662: if (!$transaction->isDeferred()) { # If the transaction is deferred we'll get to it.
663: my $client = $transaction->getClient();
664: Debug(1," Replying con_lost to ".$transaction->getRequest());
665: StartClientReply($transaction, "con_lost\n");
666: }
667: if($ConnectionRetriesLeft <= 0) {
668: Log("CRITICAL", "Host marked dead: ".GetServerHost());
669: }
670:
671: }
672:
673: =pod
674: =head1 EmptyQueue
675:
676: Fails all items in the work queue with con_lost.
677: Note that each item in the work queue is a transaction.
678:
679: =cut
680: sub EmptyQueue {
681: $ConnectionRetriesLeft--; # Counts as connection failure too.
682: while($WorkQueue->Count()) {
683: my $request = $WorkQueue->dequeue(); # This is a transaction
684: FailTransaction($request);
685: }
686: }
687:
688: =pod
689:
690: =head2 CloseAllLondConnections
691:
692: Close all connections open on lond prior to exit e.g.
693:
694: =cut
695: sub CloseAllLondConnections {
696: foreach my $Socket (keys %ActiveConnections) {
697: KillSocket($Socket);
698: }
699: }
700: =cut
701:
702: =pod
703:
704: =head2 KillSocket
705:
706: Destroys a socket. This function can be called either when a socket
707: has died of 'natural' causes or because a socket needs to be pruned due to
708: idleness. If the socket has died naturally, if there are no longer any
709: live connections a new connection is created (in case there are transactions
710: in the queue). If the socket has been pruned, it is never re-created.
711:
712: Parameters:
713:
714: =item Socket
715:
716: The socket to kill off.
717:
718: =item Restart
719:
720: nonzero if we are allowed to create a new connection.
721:
722:
723: =cut
724: sub KillSocket {
725: my $Socket = shift;
726:
727: Log("WARNING", "Shutting down a socket");
728: $Socket->Shutdown();
729:
730: # If the socket came from the active connection set,
731: # delete its transaction... note that FailTransaction should
732: # already have been called!!!
733: # otherwise it came from the idle set.
734: #
735:
736: if(exists($ActiveTransactions{$Socket})) {
737: delete ($ActiveTransactions{$Socket});
738: }
739: if(exists($ActiveConnections{$Socket})) {
740: delete($ActiveConnections{$Socket});
741: }
742: $ConnectionCount--;
743:
744: # If the connection count has gone to zero and there is work in the
745: # work queue, the work all gets failed with con_lost.
746: #
747: if($ConnectionCount == 0) {
748: EmptyQueue();
749: }
750: }
751:
752: =pod
753:
754: =head2 LondReadable
755:
756: This function is called whenever a lond connection
757: is readable. The action is state dependent:
758:
759: =head3 State=Initialized
760:
761: We''re waiting for the challenge, this is a no-op until the
762: state changes.
763:
764: =head3 State=Challenged
765:
766: The challenge has arrived we need to transition to Writable.
767: The connection must echo the challenge back.
768:
769: =head3 State=ChallengeReplied
770:
771: The challenge has been replied to. The we are receiveing the
772: 'ok' from the partner.
773:
774: =head3 State=RequestingKey
775:
776: The ok has been received and we need to send the request for
777: an encryption key. Transition to writable for that.
778:
779: =head3 State=ReceivingKey
780:
781: The the key has been requested, now we are reading the new key.
782:
783: =head3 State=Idle
784:
785: The encryption key has been negotiated or we have finished
786: reading data from the a transaction. If the callback data has
787: a client as well as the socket iformation, then we are
788: doing a transaction and the data received is relayed to the client
789: before the socket is put on the idle list.
790:
791: =head3 State=SendingRequest
792:
793: I do not think this state can be received here, but if it is,
794: the appropriate thing to do is to transition to writable, and send
795: the request.
796:
797: =head3 State=ReceivingReply
798:
799: We finished sending the request to the server and now transition
800: to readable to receive the reply.
801:
802: The parameter to this function are:
803:
804: The event. Implicit in this is the watcher and its data. The data
805: contains at least the lond connection object and, if a
806: transaction is in progress, the socket attached to the local client.
807:
808: =cut
809:
810: sub LondReadable {
811:
812: my $Event = shift;
813: my $Watcher = $Event->w;
814: my $Socket = $Watcher->data;
815: my $client = undef;
816:
817: &Debug(6,"LondReadable called state = ".$Socket->GetState());
818:
819:
820: my $State = $Socket->GetState(); # All action depends on the state.
821:
822: SocketDump(6, $Socket);
823: my $status = $Socket->Readable();
824:
825: &Debug(2, "Socket->Readable returned: $status");
826:
827: if($status != 0) {
828: # bad return from socket read. Currently this means that
829: # The socket has become disconnected. We fail the transaction.
830:
831: Log("WARNING",
832: "Lond connection lost.");
833: if(exists($ActiveTransactions{$Socket})) {
834: FailTransaction($ActiveTransactions{$Socket});
835: }
836: $Watcher->cancel();
837: KillSocket($Socket);
838: $ConnectionRetriesLeft--; # Counts as connection failure
839: return;
840: }
841: SocketDump(6,$Socket);
842:
843: $State = $Socket->GetState(); # Update in case of transition.
844: &Debug(6, "After read, state is ".$State);
845:
846: if($State eq "Initialized") {
847:
848:
849: } elsif ($State eq "ChallengeReceived") {
850: # The challenge must be echoed back; The state machine
851: # in the connection takes care of setting that up. Just
852: # need to transition to writable:
853:
854: $Watcher->cb(\&LondWritable);
855: $Watcher->poll("w");
856:
857: } elsif ($State eq "ChallengeReplied") {
858:
859:
860: } elsif ($State eq "RequestingKey") {
861: # The ok was received. Now we need to request the key
862: # That requires us to be writable:
863:
864: $Watcher->cb(\&LondWritable);
865: $Watcher->poll("w");
866:
867: } elsif ($State eq "ReceivingKey") {
868:
869: } elsif ($State eq "Idle") {
870: # If necessary, complete a transaction and then go into the
871: # idle queue.
872: # Note that a trasition to idle indicates a live lond
873: # on the other end so reset the connection retries.
874: #
875: $ConnectionRetriesLeft = $ConnectionRetries; # success resets the count
876: $Watcher->cancel();
877: if(exists($ActiveTransactions{$Socket})) {
878: Debug(8,"Completing transaction!!");
879: CompleteTransaction($Socket,
880: $ActiveTransactions{$Socket});
881: } else {
882: Log("SUCCESS", "Connection ".$ConnectionCount." to "
883: .$RemoteHost." now ready for action");
884: }
885: ServerToIdle($Socket); # Next work unit or idle.
886:
887: } elsif ($State eq "SendingRequest") {
888: # We need to be writable for this and probably don't belong
889: # here inthe first place.
890:
891: Deubg(6, "SendingRequest state encountered in readable");
892: $Watcher->poll("w");
893: $Watcher->cb(\&LondWritable);
894:
895: } elsif ($State eq "ReceivingReply") {
896:
897:
898: } else {
899: # Invalid state.
900: Debug(4, "Invalid state in LondReadable");
901: }
902: }
903:
904: =pod
905:
906: =head2 LondWritable
907:
908: This function is called whenever a lond connection
909: becomes writable while there is a writeable monitoring
910: event. The action taken is very state dependent:
911:
912: =head3 State = Connected
913:
914: The connection is in the process of sending the 'init' hailing to the
915: lond on the remote end. The connection object''s Writable member is
916: called. On error, ConnectionError is called to destroy the connection
917: and remove it from the ActiveConnections hash
918:
919: =head3 Initialized
920:
921: 'init' has been sent, writability monitoring is removed and
922: readability monitoring is started with LondReadable as the callback.
923:
924: =head3 ChallengeReceived
925:
926: The connection has received the who are you challenge from the remote
927: system, and is in the process of sending the challenge
928: response. Writable is called.
929:
930: =head3 ChallengeReplied
931:
932: The connection has replied to the initial challenge The we switch to
933: monitoring readability looking for the server to reply with 'ok'.
934:
935: =head3 RequestingKey
936:
937: The connection is in the process of requesting its encryption key.
938: Writable is called.
939:
940: =head3 ReceivingKey
941:
942: The connection has sent the request for a key. Switch to readability
943: monitoring to accept the key
944:
945: =head3 SendingRequest
946:
947: The connection is in the process of sending a request to the server.
948: This request is part of a client transaction. All the states until
949: now represent the client setup protocol. Writable is called.
950:
951: =head3 ReceivingReply
952:
953: The connection has sent a request. Now it must receive a reply.
954: Readability monitoring is requested.
955:
956: This function is an event handler and therefore receives as
957: a parameter the event that has fired. The data for the watcher
958: of this event is a reference to a list of one or two elements,
959: depending on state. The first (and possibly only) element is the
960: socket. The second (present only if a request is in progress)
961: is the socket on which to return a reply to the caller.
962:
963: =cut
964:
965: sub LondWritable {
966: my $Event = shift;
967: my $Watcher = $Event->w;
968: my $Socket = $Watcher->data;
969: my $State = $Socket->GetState();
970:
971: Debug(6,"LondWritable State = ".$State."\n");
972:
973:
974: # Figure out what to do depending on the state of the socket:
975:
976:
977:
978:
979: SocketDump(6,$Socket);
980:
981: if ($State eq "Connected") {
982:
983: if ($Socket->Writable() != 0) {
984: # The write resulted in an error.
985: # We'll treat this as if the socket got disconnected:
986: Log("WARNING", "Connection to ".$RemoteHost.
987: " has been disconnected");
988: FailTransaction($ActiveTransactions{$Socket});
989: $Watcher->cancel();
990: KillSocket($Socket);
991: return;
992: }
993: # "init" is being sent...
994:
995:
996: } elsif ($State eq "Initialized") {
997:
998: # Now that init was sent, we switch
999: # to watching for readability:
1000:
1001: $Watcher->cb(\&LondReadable);
1002: $Watcher->poll("r");
1003:
1004: } elsif ($State eq "ChallengeReceived") {
1005: # We received the challenge, now we
1006: # are echoing it back. This is a no-op,
1007: # we're waiting for the state to change
1008:
1009: if($Socket->Writable() != 0) {
1010:
1011: $Watcher->cancel();
1012: KillSocket($Socket);
1013: return;
1014: }
1015:
1016: } elsif ($State eq "ChallengeReplied") {
1017: # The echo was sent back, so we switch
1018: # to watching readability.
1019:
1020: $Watcher->cb(\&LondReadable);
1021: $Watcher->poll("r");
1022:
1023: } elsif ($State eq "RequestingKey") {
1024: # At this time we're requesting the key.
1025: # again, this is essentially a no-op.
1026: # we'll write the next chunk until the
1027: # state changes.
1028:
1029: if($Socket->Writable() != 0) {
1030: # Write resulted in an error.
1031:
1032: $Watcher->cancel();
1033: KillSocket($Socket);
1034: return;
1035:
1036: }
1037: } elsif ($State eq "ReceivingKey") {
1038: # Now we need to wait for the key
1039: # to come back from the peer:
1040:
1041: $Watcher->cb(\&LondReadable);
1042: $Watcher->poll("r");
1043:
1044: } elsif ($State eq "SendingRequest") {
1045: # At this time we are sending a request to the
1046: # peer... write the next chunk:
1047:
1048: if($Socket->Writable() != 0) {
1049:
1050: if(exists($ActiveTransactions{$Socket})) {
1051: Debug(3, "Lond connection lost, failing transactions");
1052: FailTransaction($ActiveTransactions{$Socket});
1053: }
1054: $Watcher->cancel();
1055: KillSocket($Socket);
1056: return;
1057:
1058: }
1059:
1060: } elsif ($State eq "ReceivingReply") {
1061: # The send has completed. Wait for the
1062: # data to come in for a reply.
1063: Debug(8,"Writable sent request/receiving reply");
1064: $Watcher->cb(\&LondReadable);
1065: $Watcher->poll("r");
1066:
1067: } else {
1068: # Control only passes here on an error:
1069: # the socket state does not match any
1070: # of the known states... so an error
1071: # must be logged.
1072:
1073: &Debug(4, "Invalid socket state ".$State."\n");
1074: }
1075:
1076: }
1077: =pod
1078:
1079: =cut
1080: sub QueueDelayed {
1081: Debug(3,"QueueDelayed called");
1082:
1083: my $path = "$perlvar{'lonSockDir'}/delayed";
1084:
1085: Debug(4, "Delayed path: ".$path);
1086: opendir(DIRHANDLE, $path);
1087:
1088: my @alldelayed = grep /\.$RemoteHost$/, readdir DIRHANDLE;
1089: closedir(DIRHANDLE);
1090: my $dfname;
1091: my $reqfile;
1092: foreach $dfname (sort @alldelayed) {
1093: $reqfile = "$path/$dfname";
1094: Debug(4, "queueing ".$reqfile);
1095: my $Handle = IO::File->new($reqfile);
1096: my $cmd = <$Handle>;
1097: chomp $cmd; # There may or may not be a newline...
1098: $cmd = $cmd."\n"; # now for sure there's exactly one newline.
1099: my $Transaction = LondTransaction->new($cmd);
1100: $Transaction->SetDeferred($reqfile);
1101: QueueTransaction($Transaction);
1102: }
1103:
1104: }
1105:
1106: =pod
1107:
1108: =head2 MakeLondConnection
1109:
1110: Create a new lond connection object, and start it towards its initial
1111: idleness. Once idle, it becomes elligible to receive transactions
1112: from the work queue. If the work queue is not empty when the
1113: connection is completed and becomes idle, it will dequeue an entry and
1114: start off on it.
1115:
1116: =cut
1117:
1118: sub MakeLondConnection {
1119: Debug(4,"MakeLondConnection to ".GetServerHost()." on port "
1120: .GetServerPort());
1121:
1122: my $Connection = LondConnection->new(&GetServerHost(),
1123: &GetServerPort());
1124:
1125: if($Connection == undef) { # Needs to be more robust later.
1126: Log("CRITICAL","Failed to make a connection with lond.");
1127: $ConnectionRetriesLeft--;
1128: return 0; # Failure.
1129: } else {
1130:
1131: # The connection needs to have writability
1132: # monitored in order to send the init sequence
1133: # that starts the whole authentication/key
1134: # exchange underway.
1135: #
1136: my $Socket = $Connection->GetSocket();
1137: if($Socket == undef) {
1138: die "did not get a socket from the connection";
1139: } else {
1140: &Debug(9,"MakeLondConnection got socket: ".$Socket);
1141: }
1142:
1143: $Connection->SetTimeoutCallback(\&SocketTimeout);
1144:
1145: my $event = Event->io(fd => $Socket,
1146: poll => 'w',
1147: cb => \&LondWritable,
1148: data => $Connection,
1149: desc => 'Connection to lond server');
1150: $ActiveConnections{$Connection} = $event;
1151:
1152: $ConnectionCount++;
1153: Debug(4, "Connection count = ".$ConnectionCount);
1154: if($ConnectionCount == 1) { # First Connection:
1155: QueueDelayed;
1156: }
1157: Log("SUCESS", "Created connection ".$ConnectionCount
1158: ." to host ".GetServerHost());
1159: return 1; # Return success.
1160: }
1161:
1162: }
1163:
1164: =pod
1165:
1166: =head2 StartRequest
1167:
1168: Starts a lond request going on a specified lond connection.
1169: parameters are:
1170:
1171: =item $Lond
1172:
1173: Connection to the lond that will send the transaction and receive the
1174: reply.
1175:
1176: =item $Client
1177:
1178: Connection to the client that is making this request We got the
1179: request from this socket, and when the request has been relayed to
1180: lond and we get a reply back from lond it will get sent to this
1181: socket.
1182:
1183: =item $Request
1184:
1185: The text of the request to send.
1186:
1187: =cut
1188:
1189: sub StartRequest {
1190: my $Lond = shift;
1191: my $Request = shift; # This is a LondTransaction.
1192:
1193: Debug(6, "StartRequest: ".$Request->getRequest());
1194:
1195: my $Socket = $Lond->GetSocket();
1196:
1197: $Request->Activate($Lond);
1198: $ActiveTransactions{$Lond} = $Request;
1199:
1200: $Lond->InitiateTransaction($Request->getRequest());
1201: my $event = Event->io(fd => $Socket,
1202: poll => "w",
1203: cb => \&LondWritable,
1204: data => $Lond,
1205: desc => "lond transaction connection");
1206: $ActiveConnections{$Lond} = $event;
1207: Debug(8," Start Request made watcher data with ".$event->data."\n");
1208: }
1209:
1210: =pod
1211:
1212: =head2 QueueTransaction
1213:
1214: If there is an idle lond connection, it is put to work doing this
1215: transaction. Otherwise, the transaction is placed in the work queue.
1216: If placed in the work queue and the maximum number of connections has
1217: not yet been created, a new connection will be started. Our goal is
1218: to eventually have a sufficient number of connections that the work
1219: queue will typically be empty. parameters are:
1220:
1221: =item Socket
1222:
1223: open on the lonc client.
1224:
1225: =item Request
1226:
1227: data to send to the lond.
1228:
1229: =cut
1230:
1231: sub QueueTransaction {
1232:
1233: my $requestData = shift; # This is a LondTransaction.
1234: my $cmd = $requestData->getRequest();
1235:
1236: Debug(6,"QueueTransaction: ".$cmd);
1237:
1238: my $LondSocket = $IdleConnections->pop();
1239: if(!defined $LondSocket) { # Need to queue request.
1240: Debug(8,"Must queue...");
1241: $WorkQueue->enqueue($requestData);
1242: if($ConnectionCount < $MaxConnectionCount) {
1243: if($ConnectionRetriesLeft > 0) {
1244: Debug(4,"Starting additional lond connection");
1245: if(MakeLondConnection() == 0) {
1246: EmptyQueue(); # Fail transactions, can't make connection.
1247: }
1248: } else {
1249: ShowStatus(GetServerHost()." >>> DEAD !!!! <<<");
1250: EmptyQueue(); # It's worse than that ... he's dead Jim.
1251: }
1252: }
1253: } else { # Can start the request:
1254: Debug(8,"Can start...");
1255: StartRequest($LondSocket, $requestData);
1256: }
1257: }
1258:
1259: #-------------------------- Lonc UNIX socket handling ---------------------
1260:
1261: =pod
1262:
1263: =head2 ClientRequest
1264: Callback that is called when data can be read from the UNIX domain
1265: socket connecting us with an apache server process.
1266:
1267: =cut
1268:
1269: sub ClientRequest {
1270: Debug(6, "ClientRequest");
1271: my $event = shift;
1272: my $watcher = $event->w;
1273: my $socket = $watcher->fd;
1274: my $data = $watcher->data;
1275: my $thisread;
1276:
1277: Debug(9, " Watcher named: ".$watcher->desc);
1278:
1279: my $rv = $socket->recv($thisread, POSIX::BUFSIZ, 0);
1280: Debug(8, "rcv: data length = ".length($thisread)
1281: ." read =".$thisread);
1282: unless (defined $rv && length($thisread)) {
1283: # Likely eof on socket.
1284: Debug(5,"Client Socket closed on lonc for ".$RemoteHost);
1285: close($socket);
1286: $watcher->cancel();
1287: delete($ActiveClients{$socket});
1288: return;
1289: }
1290: Debug(8,"Data: ".$data." this read: ".$thisread);
1291: $data = $data.$thisread; # Append new data.
1292: $watcher->data($data);
1293: if($data =~ /(.*\n)/) { # Request entirely read.
1294: if($data eq "close_connection_exit\n") {
1295: Log("CRITICAL",
1296: "Request Close Connection ... exiting");
1297: CloseAllLondConnections();
1298: exit;
1299: }
1300: Debug(8, "Complete transaction received: ".$data);
1301: my $Transaction = LondTransaction->new($data);
1302: $Transaction->SetClient($socket);
1303: QueueTransaction($Transaction);
1304: $watcher->cancel(); # Done looking for input data.
1305: }
1306:
1307: }
1308:
1309:
1310: =pod
1311:
1312: =head2 NewClient
1313:
1314: Callback that is called when a connection is received on the unix
1315: socket for a new client of lonc. The callback is parameterized by the
1316: event.. which is a-priori assumed to be an io event, and therefore has
1317: an fd member that is the Listener socket. We Accept the connection
1318: and register a new event on the readability of that socket:
1319:
1320: =cut
1321:
1322: sub NewClient {
1323: Debug(6, "NewClient");
1324: my $event = shift; # Get the event parameters.
1325: my $watcher = $event->w;
1326: my $socket = $watcher->fd; # Get the event' socket.
1327: my $connection = $socket->accept(); # Accept the client connection.
1328: Debug(8,"Connection request accepted from "
1329: .GetPeername($connection, AF_UNIX));
1330:
1331:
1332: my $description = sprintf("Connection to lonc client %d",
1333: $ClientConnection);
1334: Debug(9, "Creating event named: ".$description);
1335: Event->io(cb => \&ClientRequest,
1336: poll => 'r',
1337: desc => $description,
1338: data => "",
1339: fd => $connection);
1340: $ActiveClients{$connection} = $ClientConnection;
1341: $ClientConnection++;
1342: }
1343:
1344: =pod
1345:
1346: =head2 GetLoncSocketPath
1347:
1348: Returns the name of the UNIX socket on which to listen for client
1349: connections.
1350:
1351: =cut
1352:
1353: sub GetLoncSocketPath {
1354: return $UnixSocketDir."/".GetServerHost();
1355: }
1356:
1357: =pod
1358:
1359: =head2 GetServerHost
1360:
1361: Returns the host whose lond we talk with.
1362:
1363: =cut
1364:
1365: sub GetServerHost {
1366: return $RemoteHost; # Setup by the fork.
1367: }
1368:
1369: =pod
1370:
1371: =head2 GetServerPort
1372:
1373: Returns the lond port number.
1374:
1375: =cut
1376:
1377: sub GetServerPort {
1378: return $perlvar{londPort};
1379: }
1380:
1381: =pod
1382:
1383: =head2 SetupLoncListener
1384:
1385: Setup a lonc listener event. The event is called when the socket
1386: becomes readable.. that corresponds to the receipt of a new
1387: connection. The event handler established will accept the connection
1388: (creating a communcations channel), that int turn will establish
1389: another event handler to subess requests.
1390:
1391: =cut
1392:
1393: sub SetupLoncListener {
1394:
1395: my $socket;
1396: my $SocketName = GetLoncSocketPath();
1397: unlink($SocketName);
1398: unless ($socket =IO::Socket::UNIX->new(Local => $SocketName,
1399: Listen => 10,
1400: Type => SOCK_STREAM)) {
1401: die "Failed to create a lonc listner socket";
1402: }
1403: Event->io(cb => \&NewClient,
1404: poll => 'r',
1405: desc => 'Lonc listener Unix Socket',
1406: fd => $socket);
1407: }
1408:
1409: =pod
1410:
1411: =head2 ChildStatus
1412:
1413: Child USR1 signal handler to report the most recent status
1414: into the status file.
1415:
1416: We also use this to reset the retries count in order to allow the
1417: client to retry connections with a previously dead server.
1418: =cut
1419: sub ChildStatus {
1420: my $event = shift;
1421: my $watcher = $event->w;
1422:
1423: Debug(2, "Reporting child status because : ".$watcher->data);
1424: my $docdir = $perlvar{'lonDocRoot'};
1425: my $fh = IO::File->new(">>$docdir/lon-status/loncstatus.txt");
1426: print $fh $$."\t".$RemoteHost."\t".$Status."\t".
1427: $RecentLogEntry."\n";
1428: $ConnectionRetriesLeft = $ConnectionRetries;
1429: }
1430:
1431: =pod
1432:
1433: =head2 SignalledToDeath
1434:
1435: Called in response to a signal that causes a chid process to die.
1436:
1437: =cut
1438:
1439:
1440: sub SignalledToDeath {
1441: my $event = shift;
1442: my $watcher= $event->w;
1443:
1444: Debug(2,"Signalled to death! via ".$watcher->data);
1445: my ($signal) = $watcher->data;
1446: chomp($signal);
1447: Log("CRITICAL", "Abnormal exit. Child $$ for $RemoteHost "
1448: ."died through "."\"$signal\"");
1449: LogPerm("F:lonc: $$ on $RemoteHost signalled to death: "
1450: ."\"$signal\"");
1451: exit 0;
1452:
1453: }
1454:
1455: =head2 ToggleDebug
1456:
1457: This sub toggles trace debugging on and off.
1458:
1459: =cut
1460:
1461: sub ToggleDebug {
1462: my $Current = $DebugLevel;
1463: $DebugLevel = $NextDebugLevel;
1464: $NextDebugLevel = $Current;
1465:
1466: Log("SUCCESS", "New debugging level for $RemoteHost now $DebugLevel");
1467:
1468: }
1469:
1470: =head2 ChildProcess
1471:
1472: This sub implements a child process for a single lonc daemon.
1473:
1474: =cut
1475:
1476: sub ChildProcess {
1477:
1478:
1479: #
1480: # Signals must be handled by the Event framework...
1481: #
1482:
1483: Event->signal(signal => "QUIT",
1484: cb => \&SignalledToDeath,
1485: data => "QUIT");
1486: Event->signal(signal => "HUP",
1487: cb => \&ChildStatus,
1488: data => "HUP");
1489: Event->signal(signal => "USR1",
1490: cb => \&ChildStatus,
1491: data => "USR1");
1492: Event->signal(signal => "INT",
1493: cb => \&ToggleDebug,
1494: data => "INT");
1495:
1496: SetupTimer();
1497:
1498: SetupLoncListener();
1499:
1500: $Event::Debuglevel = $DebugLevel;
1501:
1502: Debug(9, "Making initial lond connection for ".$RemoteHost);
1503:
1504: # Setup the initial server connection:
1505:
1506: # &MakeLondConnection(); // let first work requirest do it.
1507:
1508:
1509: Debug(9,"Entering event loop");
1510: my $ret = Event::loop(); # Start the main event loop.
1511:
1512:
1513: die "Main event loop exited!!!";
1514: }
1515:
1516: # Create a new child for host passed in:
1517:
1518: sub CreateChild {
1519: my $sigset = POSIX::SigSet->new(SIGINT);
1520: sigprocmask(SIG_BLOCK, $sigset);
1521: my $host = shift;
1522: $RemoteHost = $host;
1523: Log("CRITICAL", "Forking server for ".$host);
1524: my $pid = fork;
1525: if($pid) { # Parent
1526: $RemoteHost = "Parent";
1527: $ChildHash{$pid} = $RemoteHost;
1528: $HostToPid{$host}= $pid;
1529: sigprocmask(SIG_UNBLOCK, $sigset);
1530:
1531: } else { # child.
1532: ShowStatus("Connected to ".$RemoteHost);
1533: $SIG{INT} = 'DEFAULT';
1534: sigprocmask(SIG_UNBLOCK, $sigset);
1535: ChildProcess; # Does not return.
1536: }
1537:
1538: }
1539: #
1540: # Parent process logic pass 1:
1541: # For each entry in the hosts table, we will
1542: # fork off an instance of ChildProcess to service the transactions
1543: # to that host. Each pid will be entered in a global hash
1544: # with the value of the key, the host.
1545: # The parent will then enter a loop to wait for process exits.
1546: # Each exit gets logged and the child gets restarted.
1547: #
1548:
1549: #
1550: # Fork and start in new session so hang-up isn't going to
1551: # happen without intent.
1552: #
1553:
1554:
1555:
1556:
1557:
1558:
1559: ShowStatus("Forming new session");
1560: my $childpid = fork;
1561: if ($childpid != 0) {
1562: sleep 4; # Give child a chacne to break to
1563: exit 0; # a new sesion.
1564: }
1565: #
1566: # Write my pid into the pid file so I can be located
1567: #
1568:
1569: ShowStatus("Parent writing pid file:");
1570: my $execdir = $perlvar{'lonDaemons'};
1571: open (PIDSAVE, ">$execdir/logs/lonc.pid");
1572: print PIDSAVE "$$\n";
1573: close(PIDSAVE);
1574:
1575:
1576:
1577: if (POSIX::setsid() < 0) {
1578: print "Could not create new session\n";
1579: exit -1;
1580: }
1581:
1582: ShowStatus("Forking node servers");
1583:
1584: Log("CRITICAL", "--------------- Starting children ---------------");
1585:
1586: my $HostIterator = LondConnection::GetHostIterator;
1587: while (! $HostIterator->end()) {
1588:
1589: my $hostentryref = $HostIterator->get();
1590: CreateChild($hostentryref->[0]);
1591: $HostHash{$hostentryref->[0]} = $hostentryref->[4];
1592: $HostIterator->next();
1593: }
1594: $RemoteHost = "Parent Server";
1595:
1596: # Maintain the population:
1597:
1598: ShowStatus("Parent keeping the flock");
1599:
1600: #
1601: # Set up parent signals:
1602: #
1603:
1604: $SIG{INT} = \&Terminate;
1605: $SIG{TERM} = \&Terminate;
1606: $SIG{HUP} = \&Restart;
1607: $SIG{USR1} = \&CheckKids;
1608: $SIG{USR2} = \&UpdateKids; # LonManage update request.
1609:
1610: while(1) {
1611: my $deadchild = wait();
1612: if(exists $ChildHash{$deadchild}) { # need to restart.
1613: my $deadhost = $ChildHash{$deadchild};
1614: delete($HostToPid{$deadhost});
1615: delete($ChildHash{$deadchild});
1616: Log("WARNING","Lost child pid= ".$deadchild.
1617: "Connected to host ".$deadhost);
1618: Log("INFO", "Restarting child procesing ".$deadhost);
1619: CreateChild($deadhost);
1620: }
1621: }
1622:
1623:
1624:
1625: =pod
1626:
1627: =head1 CheckKids
1628:
1629: Since kids do not die as easily in this implementation
1630: as the previous one, there is no need to restart the
1631: dead ones (all dead kids get restarted when they die!!)
1632: The only thing this function does is to pass USR1 to the
1633: kids so that they report their status.
1634:
1635: =cut
1636:
1637: sub CheckKids {
1638: Debug(2, "Checking status of children");
1639: my $docdir = $perlvar{'lonDocRoot'};
1640: my $fh = IO::File->new(">$docdir/lon-status/loncstatus.txt");
1641: my $now=time;
1642: my $local=localtime($now);
1643: print $fh "LONC status $local - parent $$ \n\n";
1644: foreach my $pid (keys %ChildHash) {
1645: Debug(2, "Sending USR1 -> $pid");
1646: kill 'USR1' => $pid; # Tell Child to report status.
1647: sleep 1; # Wait so file doesn't intermix.
1648: }
1649: }
1650:
1651: =pod
1652:
1653: =head1 UpdateKids
1654:
1655: parent's SIGUSR2 handler. This handler:
1656:
1657: =item
1658:
1659: Rereads the hosts file.
1660:
1661: =item
1662:
1663: Kills off (via sigint) children for hosts that have disappeared.
1664:
1665: =item
1666:
1667: HUP's children for hosts that already exist (this just forces a status display
1668: and resets the connection retry count for that host.
1669:
1670: =item
1671:
1672: Starts new children for hosts that have been added to the hosts.tab file since
1673: the start of the master program and maintains them.
1674:
1675: =cut
1676:
1677: sub UpdateKids {
1678: Log("INFO", "Updating connections via SIGUSR2");
1679: }
1680:
1681:
1682: =pod
1683:
1684: =head1 Restart
1685:
1686: Signal handler for HUP... all children are killed and
1687: we self restart. This is an el-cheapo way to re read
1688: the config file.
1689:
1690: =cut
1691:
1692: sub Restart {
1693: &KillThemAll; # First kill all the children.
1694: Log("CRITICAL", "Restarting");
1695: my $execdir = $perlvar{'lonDaemons'};
1696: unlink("$execdir/logs/lonc.pid");
1697: exec("$execdir/lonc");
1698: }
1699:
1700: =pod
1701:
1702: =head1 KillThemAll
1703:
1704: Signal handler that kills all children by sending them a
1705: SIGHUP. Responds to sigint and sigterm.
1706:
1707: =cut
1708:
1709: sub KillThemAll {
1710: Debug(2, "Kill them all!!");
1711: local($SIG{CHLD}) = 'IGNORE'; # Our children >will< die.
1712: foreach my $pid (keys %ChildHash) {
1713: my $serving = $ChildHash{$pid};
1714: Debug(2, "Killing lonc for $serving pid = $pid");
1715: ShowStatus("Killing lonc for $serving pid = $pid");
1716: Log("CRITICAL", "Killing lonc for $serving pid = $pid");
1717: kill 'QUIT' => $pid;
1718: delete($ChildHash{$pid});
1719: }
1720: my $execdir = $perlvar{'lonDaemons'};
1721: unlink("$execdir/logs/lonc.pid");
1722:
1723: }
1724:
1725: =pod
1726:
1727: =head1 Terminate
1728:
1729: Terminate the system.
1730:
1731: =cut
1732:
1733: sub Terminate {
1734: KillThemAll;
1735: Log("CRITICAL","Master process exiting");
1736: exit 0;
1737:
1738: }
1739: =pod
1740:
1741: =head1 Theory
1742:
1743: The event class is used to build this as a single process with an
1744: event driven model. The following events are handled:
1745:
1746: =item UNIX Socket connection Received
1747:
1748: =item Request data arrives on UNIX data transfer socket.
1749:
1750: =item lond connection becomes writable.
1751:
1752: =item timer fires at 1 second intervals.
1753:
1754: All sockets are run in non-blocking mode. Timeouts managed by the timer
1755: handler prevents hung connections.
1756:
1757: Key data structures:
1758:
1759: =item RequestQueue
1760:
1761: A queue of requests received from UNIX sockets that are
1762: waiting for a chance to be forwarded on a lond connection socket.
1763:
1764: =item ActiveConnections
1765:
1766: A hash of lond connections that have transactions in process that are
1767: available to be timed out.
1768:
1769: =item ActiveTransactions
1770:
1771: A hash indexed by lond connections that contain the client reply
1772: socket for each connection that has an active transaction on it.
1773:
1774: =item IdleConnections
1775:
1776: A hash of lond connections that have no work to do. These connections
1777: can be closed if they are idle for a long enough time.
1778:
1779: =cut
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>