Hi!

I've patched the latest kernel release (2.0.12 at the time of writing) to allow swapping via NFS. Yes, I know, it is slow. But there is little sense in allowing the root file system on NFS while not supporting swapping via NFS, except maybe for administrative reasons.

There is a patch for linux-1.3.15, contained in the etherboot-2.0 package and written by Markus Gutschke (gutschk@math.uni-muenster.de), that adds support for NFS swapping. That patch doesn't apply without modification to the 2.0.x kernels because the NFS code has changed a lot, but I used it as a starting point.

Some notes on the implementation:

* To prevent deadlocks caused by insufficient memory during NFS swapping, every task that is involved in an NFS transfer implicitly gets the GFP_NFS priority for memory allocation. This is achieved with a global counter, "int nfs_swap_active", which is incremented every time a task enters rw_swap_page() and tries to swap via NFS, and decremented when that task has finished its NFS swapping. Together with a new flag in the task structure, "task->doing_nfs", this allows the priority override for memory allocations. "current->doing_nfs" is set to one whenever a task performs NFS transfers. Thus the override takes precedence over the priority argument to get_free_pages() only when some task is currently swapping via NFS and the task requesting memory is itself involved in an NFS transfer at that moment.

* The NFS code sometimes calls kmalloc() to allocate some pages. This may lead to a call to try_to_swap_out(), and the task could then recursively call the NFS code to swap some pages out via NFS. That MUST be prohibited, as it could lead to a loop that would use up the NFS request queues. Therefore, if "current->doing_nfs" is set to 1, try_to_swap_out() skips swap files that are located on NFS-mounted volumes. Maybe one could do this in a more fine-grained way, but it seems to work.

* The routine rw_swap_page() calls "nfs_proc_write()" and "nfs_proc_read()" directly, rather than calling "file->f_op->write()" etc. First, this reduces the overhead a little. Second, one has to fake root permissions when accessing the swap files (swap files really should be readable and writable ONLY by the superuser), and I felt uncomfortable with doing it. It might still be quite a good thing to use the f_ops, but the next item is one more argument against them.

* Asynchronous access to NFS-mounted volumes that contain active swap files must be prohibited. I'm not quite sure why, but I think this is mainly because generic_file_read_ahead() uses up memory very quickly, since the NFS code simply marks pages as not uptodate and returns when doing asynchronous reads. Also, the NFS request queues fill up very quickly when doing asynchronous reads AND swapping on the same NFS volume. Of course, the reason is that there is an infinite loop somewhere: one does async I/O, runs out of memory and recursively enters the NFS code again by calling try_to_swap_out(). Also, the task doing async I/O via NFS is NOT marked via "task->doing_nfs", as it doesn't actually do any NFS transfers itself but leaves them to the nfsiods. In short, I prevent async I/O on NFS volumes containing active swap files by adding a new field to the NFS server struct, namely "no_async", which is incremented at each call to sys_swapon() and decremented by sys_swapoff().

I've tested the thing on a quite paranoid configuration: only 4M of memory, and all disks, including the root fs, were located on NFS. I tried to break the system by overloading it, but it seems that the code works now. I started a couple of large tasks that were trying to eat up about 12M at the same time, and the system still didn't hang, though it took, of course, ages until all the programs were loaded.
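
To make the two roles of "doing_nfs" from the notes above easier to follow, here is a minimal user-space sketch of the decisions involved. It is a model, not kernel code: the type and function names (model_task, model_reserved_pages, model_may_swap_to) are hypothetical, and only the conditions themselves are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's allocation priorities. */
enum model_priority { MODEL_GFP_USER, MODEL_GFP_NFS };

/* Models only the one task field that the patch adds. */
struct model_task {
	bool doing_nfs;		/* set while the task performs an NFS transfer */
};

/* Models the global counter: nonzero while some task swaps via NFS. */
int model_nfs_swap_active = 0;

/*
 * Number of free pages the allocator keeps back for this request.
 * The small NFS reserve applies if the caller asked for it explicitly,
 * or if the override is in effect (task does NFS AND NFS swap is active).
 */
int model_reserved_pages(enum model_priority prio,
			 const struct model_task *task,
			 int min_free_pages, int nfs_reserve)
{
	assert(task != NULL);
	if (prio == MODEL_GFP_NFS ||
	    (task->doing_nfs && model_nfs_swap_active > 0))
		return nfs_reserve;
	return min_free_pages;
}

/*
 * Recursion guard: a task that is already inside an NFS transfer
 * must not swap pages out to a swap area that lives on NFS.
 */
bool model_may_swap_to(const struct model_task *task, bool area_is_nfs)
{
	assert(task != NULL);
	return !(task->doing_nfs && area_is_nfs);
}
```

With min_free_pages = 16 and an NFS reserve of 5, a task with doing_nfs set gets the smaller reserve only while model_nfs_swap_active is nonzero; otherwise it is treated like any other task.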

Also, I used a modified version of the program "swapd" by Nick Holloway, which allocates swap files on the file system as needed (I modified it to be able to lock itself in memory and to acquire real-time priority, if somebody is interested). This means that the system survived repeated additions and removals of swap space, as well as multiple NFS swap files.

Warning: it is still possible to lock up the system due to memory fragmentation. Unfortunately, the page allocation code can't guarantee the allocation of memory chunks of more than 1 page. This means that really safe setups are only possible with the NFS rsize and wsize below 4096 bytes, at least for the NFS volumes that contain swap files. Of course, performance is reduced by a small rsize and wsize.

Also, it might be necessary to adjust the parameters in /proc/sys/vm/freepages, at least for systems with little physical memory. This might be necessary because the NFS code is allowed to allocate reserved pages to prevent deadlock. But when no reserved pages remain, one loses. /proc/sys/vm/freepages contains 3 numbers:

    min_free_pages free_pages_low free_pages_high

Linux set these values to "16 #somevalue #somevalue" on the 4 MB machine. I used somewhat higher values. Hint: one can set these parameters with

    echo "30 50 80" > /proc/sys/vm/freepages

when the proc fs is mounted rw.

One last note: there is a bug in the kswapd code that can cause kswapd to sleep forever and never wake up again. This is because both the kswapd timer and the kswapd daemon itself try to set the variable "kswapd_awake" when the daemon is woken up, which can lead to a deadlock for the kswapd daemon. I'll send mail on this bug to Linus separately (I'm quite sure that it almost never happens, except of course in my testing environment, which was so low on RAM).
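
To illustrate the cost of a small rsize, here is a rough user-space sketch of how one swap page is fetched in rsize-sized pieces, modelled on the read loop that the patch adds to rw_swap_page(). All names here (model_server_file, model_nfs_read, model_read_swap_page) are hypothetical, the "server" is just an array in memory, and unlike the patch the sketch clamps the last chunk so that an rsize which does not divide the page size cannot overrun the page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MODEL_PAGE_SIZE = 4096, MODEL_FILE_PAGES = 4 };

/* Hypothetical in-memory stand-in for the swap file on the server. */
char model_server_file[MODEL_FILE_PAGES * MODEL_PAGE_SIZE];

/* Counts how many read requests were issued. */
int model_request_count = 0;

/* Stand-in for one NFS read request; returns the bytes transferred. */
int model_nfs_read(unsigned int offset, int len, char *dst)
{
	memcpy(dst, model_server_file + offset, (size_t)len);
	model_request_count++;
	return len;
}

/*
 * Read one page in chunks of at most rsize bytes.
 * Returns 0 on success, -1 if a request came back short.
 */
int model_read_swap_page(unsigned int page_index, int rsize, char *page)
{
	unsigned int base = page_index * MODEL_PAGE_SIZE;
	int n, done = 0;

	assert(page != NULL && rsize > 0);
	assert(page_index < (unsigned int)MODEL_FILE_PAGES);

	/* never ask for more than one page per request */
	n = rsize > MODEL_PAGE_SIZE ? MODEL_PAGE_SIZE : rsize;

	while (done < MODEL_PAGE_SIZE) {
		int want = MODEL_PAGE_SIZE - done;

		if (want > n)
			want = n;	/* clamp the final chunk */
		if (model_nfs_read(base + (unsigned int)done, want,
				   page + done) < want)
			return -1;
		done += want;
	}
	return 0;
}
```

In this model an rsize of 1024 costs four requests per 4096-byte page, while any rsize of 4096 or more costs one, which is the performance trade-off mentioned in the warning above.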

Have fun

Claus

P.S.: The patch is against plain linux-2.0.12

#########################################################################################
diff -u --recursive --new-file linux-2.0-orig/fs/Config.in linux-2.0-patched/fs/Config.in
--- linux-2.0-orig/fs/Config.in Fri Jul 26 14:59:47 1996
+++ linux-2.0-patched/fs/Config.in Wed Aug 7 01:07:27 1996
@@ -21,6 +21,7 @@
 if [ "$CONFIG_INET" = "y" ]; then
   tristate 'NFS filesystem support' CONFIG_NFS_FS
   if [ "$CONFIG_NFS_FS" = "y" ]; then
+    bool ' Allow swap files to be on NFS filesystems' CONFIG_SWAP_NFS
     bool ' Root file system on NFS' CONFIG_ROOT_NFS
     if [ "$CONFIG_ROOT_NFS" = "y" ]; then
       bool ' BOOTP support' CONFIG_RNFS_BOOTP
diff -u --recursive --new-file linux-2.0-orig/fs/nfs/bio.c linux-2.0-patched/fs/nfs/bio.c
--- linux-2.0-orig/fs/nfs/bio.c Fri Jul 26 15:01:25 1996
+++ linux-2.0-patched/fs/nfs/bio.c Sun Aug 11 20:23:22 1996
@@ -61,7 +61,7 @@
 		if (count < rsize)
 			rsize = count;
 		result = nfs_proc_read(NFS_SERVER(inode), NFS_FH(inode),
-			pos, rsize, buf, &fattr);
+			pos, rsize, buf, &fattr, 0);
 		dprintk("nfs_proc_read(%s, (%x,%lx), %ld, %d, %p) = %d\n",
 			NFS_SERVER(inode)->hostname,
 			inode->i_dev, inode->i_ino,
@@ -213,7 +213,7 @@
 	dprintk("NFS: nfs_readpage %08lx\n", page_address(page));
 	address = page_address(page);
 	page->count++;
-	if (!PageError(page) && NFS_SERVER(inode)->rsize >= PAGE_SIZE)
+	if ( !NFS_SERVER(inode)->no_async && !PageError(page) && NFS_SERVER(inode)->rsize >= PAGE_SIZE)
 		error = do_read_nfs_async(inode, page);
 	if (error < 0) /* couldn't enqueue */
 		error = do_read_nfs_sync(inode, page);
diff -u --recursive --new-file linux-2.0-orig/fs/nfs/file.c linux-2.0-patched/fs/nfs/file.c
--- linux-2.0-orig/fs/nfs/file.c Fri Jul 26 15:03:14 1996
+++ linux-2.0-patched/fs/nfs/file.c Thu Aug 8 13:53:17 1996
@@ -134,8 +134,7 @@
 		int hunk = count - written;
 		if (hunk >= wsize)
 			hunk = wsize;
-		result = nfs_proc_write(inode,
-			pos, hunk, buf, &fattr);
+		result = nfs_proc_write(inode, pos, hunk, buf, &fattr, 0);
 		if (result < 0)
 			break;
 		pos += hunk;
diff -u --recursive --new-file linux-2.0-orig/fs/nfs/inode.c linux-2.0-patched/fs/nfs/inode.c
--- linux-2.0-orig/fs/nfs/inode.c Fri Jul 26 14:47:51 1996
+++ linux-2.0-patched/fs/nfs/inode.c Mon Aug 12 01:25:27 1996
@@ -143,7 +143,7 @@
 	sb->s_op = &nfs_sops;
 	server = &sb->u.nfs_sb.s_server;
 	server->file = filp;
-	server->lock = 0;
+	server->no_async = 0;
 	server->wait = NULL;
 	server->flags = data->flags;
 	server->rsize = data->rsize;
@@ -343,6 +343,9 @@
 	exit_mm(current);
 	current->session = 1;
 	current->pgrp = 1;
+#ifndef MODULE
+	current->blocked = ~0UL;
+#endif
 	sprintf(current->comm, "nfsiod");
 	ret = nfsiod();
 	MOD_DEC_USE_COUNT;
diff -u --recursive --new-file linux-2.0-orig/fs/nfs/proc.c linux-2.0-patched/fs/nfs/proc.c
--- linux-2.0-orig/fs/nfs/proc.c Fri Jul 26 15:01:25 1996
+++ linux-2.0-patched/fs/nfs/proc.c Sun Aug 11 16:52:05 1996
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 #include
@@ -142,13 +143,16 @@
 }
-static inline int *xdr_encode_data(int *p, const char *data, int len)
+static inline int *xdr_encode_data(int *p, const char *data, int len, int swap)
 {
 	int quadlen = QUADLEN(len);
 	p[quadlen] = 0;
 	*p++ = htonl(len);
-	memcpy_fromfs(p, data, len);
+	if (!swap)
+		memcpy_fromfs(p, data, len);
+	else
+		memcpy(p, data, len);
 	return p + quadlen;
 }
@@ -371,7 +375,8 @@
 }
 int nfs_proc_read(struct nfs_server *server, struct nfs_fh *fhandle,
-	int offset, int count, char *data, struct nfs_fattr *fattr)
+	int offset, int count, char *data, struct nfs_fattr *fattr,
+	int swap)
 {
 	int *p, *p0;
 	int status;
@@ -382,7 +387,11 @@
 	if (!(p0 = nfs_rpc_alloc(server->rsize)))
 		return -EIO;
 retry:
-	p = nfs_rpc_header(p0, NFSPROC_READ, ruid);
+	if (swap)
+		p= rpc_header(p0, NFSPROC_READ, NFS_PROGRAM, NFS_VERSION,
+			init_task.uid, init_task.egid, init_task.groups);
+	else
+		p = nfs_rpc_header(p0, NFSPROC_READ, ruid);
 	p = xdr_encode_fhandle(p, fhandle);
 	*p++ = htonl(offset);
 	*p++ = htonl(count);
@@ -492,7 +501,8 @@
 }
 int nfs_proc_write(struct inode * inode, int offset,
-	int count, const char *data, struct nfs_fattr *fattr)
+	int count, const char *data, struct nfs_fattr *fattr,
+	int swap)
 {
 	int *p, *p0;
 	int status;
@@ -505,15 +515,20 @@
 	if (!(p0 = nfs_rpc_alloc(server->wsize)))
 		return -EIO;
 retry:
-	p = nfs_rpc_header(p0, NFSPROC_WRITE, ruid);
+	if (swap)
+		p= rpc_header(p0, NFSPROC_WRITE, NFS_PROGRAM, NFS_VERSION,
+			init_task.uid, init_task.egid, init_task.groups);
+	else
+		p = nfs_rpc_header(p0, NFSPROC_WRITE, ruid);
 	p = xdr_encode_fhandle(p, fhandle);
 	*p++ = htonl(offset); /* traditional, could be any value */
 	*p++ = htonl(offset);
 	*p++ = htonl(count); /* traditional, could be any value */
 	kdata = (void *) (p+1); /* start of data in RPC buffer */
-	p = xdr_encode_data(p, data, count);
+	p = xdr_encode_data(p, data, count, swap );
 	if ((status = nfs_rpc_call(server, p0, p, server->wsize)) < 0) {
 		nfs_rpc_free(p0);
+		printk("nfs_proc_write: status: %d\n", status );
 		return status;
 	}
 	if (!(p = nfs_rpc_verify(p0)))
diff -u --recursive --new-file linux-2.0-orig/fs/nfs/rpcsock.c linux-2.0-patched/fs/nfs/rpcsock.c
--- linux-2.0-orig/fs/nfs/rpcsock.c Fri Jul 26 14:59:47 1996
+++ linux-2.0-patched/fs/nfs/rpcsock.c Sun Aug 11 16:42:37 1996
@@ -66,7 +66,10 @@
 static inline void
 rpc_insque(struct rpc_sock *rsock, struct rpc_wait *slot)
 {
-	struct rpc_wait *next = rsock->pending;
+	struct rpc_wait *next;
+
+	cli();
+	next = rsock->pending;
 	slot->w_next = next;
 	slot->w_prev = NULL;
@@ -75,6 +78,7 @@
 	rsock->pending = slot;
 	slot->w_queued = 1;
+	sti();
 	dprintk("RPC: inserted %p into queue\n", slot);
 }
@@ -84,9 +88,11 @@
 static inline void
 rpc_remque(struct rpc_sock *rsock, struct rpc_wait *slot)
 {
-	struct rpc_wait *prev = slot->w_prev,
-		*next = slot->w_next;
+	struct rpc_wait *prev, *next;
+	cli();
+	prev = slot->w_prev,
+	next = slot->w_next;
 	if (prev != NULL)
 		prev->w_next = next;
 	else
@@ -95,6 +101,8 @@
 		next->w_prev = prev;
 	slot->w_queued = 0;
+	sti();
+
 	dprintk("RPC: removed %p from queue, head now %p.\n",
 		slot, rsock->pending);
 }
@@ -195,7 +203,9 @@
 	req->rq_slot = NULL;
-	while (!(slot = rsock->free) || rsock->cong >= rsock->cwnd) {
+	cli();
+	while (!(slot = rsock->free) || rsock->cong >= rsock->cwnd ) {
+		sti();
 		if (nocwait) {
 			current->timeout = 0;
 			return -ENOBUFS;
@@ -208,6 +218,7 @@
 			return -ERESTARTSYS;
 		if (rsock->shutdown)
 			return -EIO;
+		cli();
 	}
 	rsock->free = slot->w_next;
@@ -219,6 +230,7 @@
 	dprintk("RPC: reserved slot %p\n", slot);
 	req->rq_slot = slot;
+	sti();
 	return 0;
 }
@@ -237,6 +249,7 @@
 	if (slot == rsock->pending && slot->w_next != NULL)
 		wake_up(&slot->w_next->w_wait);
+	cli();
 	/* remove slot from queue of pending */
 	if (slot->w_queued)
 		rpc_remque(rsock, slot);
@@ -245,6 +258,8 @@
 	/* decrease congestion value */
 	rsock->cong -= RPC_CWNDSCALE;
+	sti();
+
 	if (rsock->cong < rsock->cwnd && rsock->backlog)
 		wake_up(&rsock->backlog);
 	if (rsock->shutdown)
diff -u --recursive --new-file linux-2.0-orig/fs/nfs/sock.c linux-2.0-patched/fs/nfs/sock.c
--- linux-2.0-orig/fs/nfs/sock.c Fri Jul 26 14:40:24 1996
+++ linux-2.0-patched/fs/nfs/sock.c Sun Aug 11 16:45:46 1996
@@ -87,8 +87,21 @@
 	maxtimeo = timeout.to_maxval;
 	do {
+#ifdef CONFIG_SWAP_NFS
+#ifdef DEBUG_SWAP_NFS
+		if ( current->doing_nfs ) {
+			printk( LOG_WARN"Dangerous: current %s already "
+				"doing nfs. This may lock your system!\n",
+				current->comm );
+		}
+#endif
+		current->doing_nfs = 1;
+#endif
 		result = rpc_doio(server->rsock, req, &timeout, async);
 		rpc_release(server->rsock, req); /* Release slot */
+#ifdef CONFIG_SWAP_NFS
+		current->doing_nfs = 0;
+#endif
 		if (current->signal & ~current->blocked)
 			result = -ERESTARTSYS;
diff -u --recursive --new-file linux-2.0-orig/include/linux/nfs_fs.h linux-2.0-patched/include/linux/nfs_fs.h
--- linux-2.0-orig/include/linux/nfs_fs.h Fri Jul 26 15:03:14 1996
+++ linux-2.0-patched/include/linux/nfs_fs.h Thu Aug 8 13:28:40 1996
@@ -73,9 +73,10 @@
 			unsigned int maxlen);
 extern int nfs_proc_read(struct nfs_server *server, struct nfs_fh *fhandle,
 			int offset, int count, char *data,
-			struct nfs_fattr *fattr);
+			struct nfs_fattr *fattr, int swap);
 extern int nfs_proc_write(struct inode * inode, int offset,
-			int count, const char *data, struct nfs_fattr *fattr);
+			int count, const char *data, struct nfs_fattr *fattr,
+			int swap);
 extern int nfs_proc_create(struct nfs_server *server, struct nfs_fh *dir,
 			const char *name, struct nfs_sattr *sattr,
 			struct nfs_fh *fhandle, struct nfs_fattr *fattr);
diff -u --recursive --new-file linux-2.0-orig/include/linux/nfs_fs_sb.h linux-2.0-patched/include/linux/nfs_fs_sb.h
--- linux-2.0-orig/include/linux/nfs_fs_sb.h Fri Jul 26 14:22:14 1996
+++ linux-2.0-patched/include/linux/nfs_fs_sb.h Sun Aug 11 19:19:38 1996
@@ -8,7 +8,7 @@
 	struct file *file;
 	struct rpc_sock *rsock;
 	struct sockaddr toaddr ; /* Added for change to NFS code to use sendto() 1995-06-02 JSP */
-	int lock;
+	int no_async;
 	struct wait_queue *wait;
 	int flags;
 	int rsize;
diff -u --recursive --new-file linux-2.0-orig/include/linux/sched.h linux-2.0-patched/include/linux/sched.h
--- linux-2.0-orig/include/linux/sched.h Fri Jul 26 15:01:57 1996
+++ linux-2.0-patched/include/linux/sched.h Sun Aug 11 19:19:48 1996
@@ -215,6 +215,10 @@
 /* mm fault and swap info: this can arguably be seen as either mm-specific or thread-specific */
 	unsigned long min_flt, maj_flt, nswap, cmin_flt, cmaj_flt, cnswap;
 	int swappable:1;
+	int doing_nfs:1; /* this is set whenever the task does nfs,
+			  * let's it allocate more memory to prevent deadlocks
+			  * when swapping to nfs.
+			  */
 	unsigned long swap_address;
 	unsigned long old_maj_flt; /* old value of maj_flt */
 	unsigned long dec_flt; /* page fault count of the last time */
@@ -294,7 +298,7 @@
 /* timer */ { NULL, NULL, 0, 0, it_real_fn }, \
 /* utime */ 0,0,0,0,0, \
 /* flt */ 0,0,0,0,0,0, \
-/* swp */ 0,0,0,0,0, \
+/* swp */ 0,0,0,0,0,0, \
 /* rlimits */ INIT_RLIMITS, \
 /* math */ 0, \
 /* comm */ "swapper", \
diff -u --recursive --new-file linux-2.0-orig/include/linux/swap.h linux-2.0-patched/include/linux/swap.h
--- linux-2.0-orig/include/linux/swap.h Fri Jul 26 14:59:15 1996
+++ linux-2.0-patched/include/linux/swap.h Wed Aug 7 00:50:36 1996
@@ -11,8 +11,9 @@
 #include
-#define SWP_USED 1
-#define SWP_WRITEOK 3
+#define SWP_USED 0x01
+#define SWP_WRITEOK 0x03
+#define SWP_ISNFS 0x04
 #define SWAP_CLUSTER_MAX 32
diff -u --recursive --new-file linux-2.0-orig/mm/page_alloc.c linux-2.0-patched/mm/page_alloc.c
--- linux-2.0-orig/mm/page_alloc.c Mon Aug 12 10:42:26 1996
+++ linux-2.0-patched/mm/page_alloc.c Sun Aug 11 16:20:35 1996
@@ -27,6 +27,16 @@
 int nr_swap_pages = 0;
 int nr_free_pages = 0;
+/*
+ * This is set in linux/mm/page_io.c and is necessary to avoid dead locks
+ * when using a NFS filesystem. It overrides the priority field of kmalloc
+ * when requesting memory through any of the network modules during NFS
+ * operations.
+ */
+#ifdef CONFIG_SWAP_NFS
+extern int nfs_swap_active;
+#endif
+
 /*
  * Free area management
  *
@@ -202,9 +212,18 @@
 			priority = GFP_ATOMIC;
 		}
 	}
-	reserved_pages = 5;
-	if (priority != GFP_NFS)
-		reserved_pages = min_free_pages;
+	reserved_pages = min_free_pages;
+#ifdef CONFIG_SWAP_NFS
+#undef DEBUG_SWAP_NFS_RESERVED_PAGES_HACK
+#ifdef DEBUG_SWAP_NFS_RESERVED_PAGES_HACK
+	if ( (priority != GFP_NFS) && (current->doing_nfs && nfs_swap_active && (nr_free_pages < reserved_pages)) )
+		printk("nfs_pid effective, reserved: %d, free: %d!\n", reserved_pages, nr_free_pages);
+#endif
+	if ( (priority == GFP_NFS) || (current->doing_nfs && nfs_swap_active) )
+#else
+	if (priority == GFP_NFS)
+#endif
+		reserved_pages = 5;
 	save_flags(flags);
 repeat:
 	cli();
diff -u --recursive --new-file linux-2.0-orig/mm/page_io.c linux-2.0-patched/mm/page_io.c
--- linux-2.0-orig/mm/page_io.c Fri Jul 26 14:59:15 1996
+++ linux-2.0-patched/mm/page_io.c Sun Aug 11 16:28:49 1996
@@ -8,6 +8,7 @@
  * Removed race in async swapping. 14.4.1996. Bruno Haible
  */
+#include
 #include
 #include
 #include
@@ -20,6 +21,12 @@
 #include
 #include
 #include
+#ifdef CONFIG_SWAP_NFS
+#include
+#include
+#include
+#include
+#endif
 #include
 #include /* for cli()/sti() */
@@ -27,8 +34,15 @@
 #include
 #include
+#ifdef CONFIG_SWAP_NFS
+/* This is incremented each swapping via nfs is attempted, and decremented
+ * after having finished the nfs swap. It overrides (together with
+ * current->doing_nfs the priority field in __get_free_pages() to prevent
+ * deadlocks when swapping via nfs.
+ */
+int nfs_swap_active = 0;
+#endif
 static struct wait_queue * lock_queue = NULL;
-
 /*
  * Reads or writes a swap page.
  * wait=1: start I/O and wait for completion. wait=0: start asynchronous I/O.
@@ -99,6 +113,47 @@
 		struct inode *swapf = p->swap_file;
 		unsigned int zones[PAGE_SIZE/512];
 		int i;
+#ifdef CONFIG_SWAP_NFS
+		if ((p->flags & SWP_ISNFS) == SWP_ISNFS) {
+			int j;
+			struct nfs_fattr fattr;
+			unsigned int block = offset * PAGE_SIZE;
+
+			nfs_swap_active++;
+
+			if (rw == READ) {
+				int n = NFS_SERVER(swapf)->rsize;
+				if (n > PAGE_SIZE)
+					n = PAGE_SIZE;
+				for (i = 0; i < PAGE_SIZE; i += n) {
+					j = nfs_proc_read(NFS_SERVER(swapf),
+						NFS_FH(swapf), block, n, buf,
+						&fattr, 1);
+					if (j < n) break;
+					block += n;
+					buf += n;
+				}
+			} else {
+				int n = NFS_SERVER(swapf)->wsize;
+				if (n > PAGE_SIZE)
+					n = PAGE_SIZE;
+				for (i = 0; i < PAGE_SIZE; i += n) {
+					j = nfs_proc_write(swapf, block, n, buf,
+						&fattr, 1);
+					if (j < 0) break;
+					block += n;
+					buf += n;
+				}
+			}
+			nfs_refresh_inode(swapf, &fattr);
+
+			nfs_swap_active--;
+
+			if (i < PAGE_SIZE || j < 0)
+				printk("rw_swap_page: NFS swap file error %d, offset %u (%s)\n",
+					j, block, (rw == READ) ? "reading" : "writing");
+		} else
+#endif
 		if (swapf->i_op->bmap == NULL
 			&& swapf->i_op->smap != NULL){
 			/*
@@ -122,6 +177,7 @@
 					return;
 				}
 			}
+			ll_rw_swap_file(rw,swapf->i_dev, zones, i,buf);
 		}else{
 			int j;
 			unsigned int block = offset
@@ -131,8 +187,8 @@
 				if (!(zones[i] = bmap(swapf,block++))) {
 					printk("rw_swap_page: bad swap file\n");
 				}
+			ll_rw_swap_file(rw,swapf->i_dev, zones, i,buf);
 		}
-		ll_rw_swap_file(rw,swapf->i_dev, zones, i,buf);
 	} else
 		printk("rw_swap_page: no swap file or device\n");
 	atomic_dec(&page->count);
diff -u --recursive --new-file linux-2.0-orig/mm/swapfile.c linux-2.0-patched/mm/swapfile.c
--- linux-2.0-orig/mm/swapfile.c Fri Jul 26 15:01:32 1996
+++ linux-2.0-patched/mm/swapfile.c Sun Aug 11 19:50:41 1996
@@ -17,6 +17,12 @@
 #include
 #include
 #include /* for blk_size */
+#ifdef CONFIG_SWAP_NFS
+#include
+#include
+#include
+#include
+#endif
 #include
 #include /* for cli()/sti() */
@@ -354,7 +360,8 @@
 		/* just pick something that's safe... */
 		swap_list.next = swap_list.head;
 	}
-	p->flags = SWP_USED;
+	p->flags &= ~SWP_WRITEOK;
+	p->flags |= SWP_USED;
 	err = try_to_unuse(type);
 	if (err) {
 		iput(inode);
@@ -367,7 +374,7 @@
 			swap_list.head = swap_list.next = p - swap_info;
 		else
 			swap_info[prev].next = p - swap_info;
-		p->flags = SWP_WRITEOK;
+		p->flags |= SWP_WRITEOK;
 		return err;
 	}
 	if(p->swap_device){
@@ -381,6 +388,12 @@
 			filp.f_op->release(inode,&filp);
 		}
 	}
+#ifdef CONFIG_SWAP_NFS
+	else if ((p->flags & SWP_ISNFS) == SWP_ISNFS) {
+		NFS_SERVER(p->swap_file)->no_async--; /* re-enable async reads
+						       */
+	}
+#endif
 	iput(inode);
 	nr_swap_pages -= p->pages;
@@ -471,12 +484,25 @@
 		}
 	} else if (!S_ISREG(swap_inode->i_mode))
 		goto bad_swap;
+#ifdef CONFIG_SWAP_NFS
+	else if (swap_inode->i_sb &&
+		 swap_inode->i_sb->s_type &&
+		 swap_inode->i_sb->s_type->name &&
+		 !strcmp(swap_inode->i_sb->s_type->name, "nfs")) {
+		p->flags |= SWP_ISNFS;
+		NFS_SERVER(swap_inode)->no_async++; /* async reads really
+						     * doesn't work
+						     */
+	}
+#endif
 	p->swap_lockmap = (unsigned char *) get_free_page(GFP_USER);
 	if (!p->swap_lockmap) {
 		printk("Unable to start swapping: out of memory :-)\n");
 		error = -ENOMEM;
 		goto bad_swap;
 	}
+
+
 	read_swap_page(SWP_ENTRY(type,0), (char *) p->swap_lockmap);
 	if (memcmp("SWAP-SPACE",p->swap_lockmap+PAGE_SIZE-10,10)) {
 		printk("Unable to find swap-space signature\n");
@@ -514,7 +540,7 @@
 	}
 	p->swap_map[0] = 0x80;
 	memset(p->swap_lockmap,0,PAGE_SIZE);
-	p->flags = SWP_WRITEOK;
+	p->flags |= SWP_WRITEOK;
 	p->pages = j;
 	nr_swap_pages += j;
 	printk("Adding Swap: %dk swap-space\n",j<<(PAGE_SHIFT-10));
diff -u --recursive --new-file linux-2.0-orig/mm/vmscan.c linux-2.0-patched/mm/vmscan.c
--- linux-2.0-orig/mm/vmscan.c Fri Jul 26 15:02:42 1996
+++ linux-2.0-patched/mm/vmscan.c Sun Aug 11 16:25:33 1996
@@ -110,6 +110,17 @@
 			return 0;
 		if (!(entry = get_swap_page()))
 			return -1; /* Aieee!!! Out of swap space! */
+#ifdef CONFIG_SWAP_NFS
+		/* This must be checked. kmalloc calls sometimes try_to_free_page()
+		 * and kmalloc is called from inside the nfs code.
+		 */
+		if ( current->doing_nfs &&
+		     (swap_info[SWP_TYPE(entry)].flags & SWP_ISNFS) == SWP_ISNFS ) {
+			/* avoid loops when using kmalloc during nfs swapping
+			 */
+			return 0;
+		}
+#endif
 		vma->vm_mm->rss--;
 		flush_cache_page(vma, address);
 		set_pte(page_table, __pte(entry));
@@ -428,7 +439,6 @@
 	if (!kswapd_awake && kswapd_ctl.maxpages > 0) {
 		wake_up(&kswapd_wait);
 		need_resched = 1;
-		kswapd_awake = 1;
 	}
 	next_swap_jiffies = jiffies + swapout_interval;
 }