'before-ckrm_E17-mem'.
- freeing pages from the inactive list (shrink_zone)
depending on the recent usage of the page (approximately).
-In the process of the life cycle a page can move from the lru list to swap
-and back. For this document's purpose, we treat this the same as freeing and
-allocating the page, respectively.
-
1. Introduction
---------------
Memory resource controller controls the number of lru physical pages
9 = /dev/urandom Faster, less secure random number gen.
10 = /dev/aio Asynchronous I/O notification interface
11 = /dev/kmsg Writes to this come out as printk's
- 12 = /dev/oldmem Access to kexec-ed crash dump
1 block RAM disk
0 = /dev/ram0 First RAM disk
1 = /dev/ram1 Second RAM disk
+++ /dev/null
-Documentation for kdump - the kexec based crash dumping solution
-================================================================
-
-DESIGN
-======
-
-We use kexec to reboot to a second kernel whenever a dump needs to be taken.
-This second kernel is booted with very little memory (configurable
-at compile time). The first kernel reserves the section of memory that the
-second kernel uses. This ensures that on-going DMA from the first kernel
-does not corrupt the second kernel. The first 640k of physical memory is
-needed irrespective of where the kernel loads. Hence, this region is
-backed up before reboot.
-
-In the second kernel, "old memory" can be accessed in two ways. The
-first one is through a device interface. We can create a /dev/oldmem or
-whatever and write out the memory in raw format. The second interface is
-through /proc/vmcore. This exports the dump as an ELF format file which
-can be written out using any file copy command (cp, scp, etc). Further, gdb
-can be used to perform some minimal debugging on the dump file. Both these
-methods ensure that there is correct ordering of the dump pages (corresponding
-to the first 640k that has been relocated).
-
-SETUP
-=====
-
-1) Obtain the appropriate -mm tree patch and apply it to the vanilla
- kernel tree.
-
-2) Two kernels need to be built in order to get this feature working.
-
- For the first kernel, choose the default values for the following options.
-
- a) Physical address where the kernel is loaded
- b) kexec system call
- c) kernel crash dumps
-
- All the options are under "Processor type and features"
-
- For the second kernel, change (a) to 16MB. If you want to choose another
- value here, ensure "location from where the crash dumping kernel will boot
- (MB)" under (c) reflects the same value.
-
- Also ensure you have CONFIG_HIGHMEM on.
-
-3) Boot into the first kernel. You are now ready to try out kexec based crash
- dumps.
-
-4) Load the second kernel to be booted using
-
- kexec -p <second-kernel> --args-linux --append="root=<root-dev> dump
- init 1 memmap=exactmap memmap=640k@0 memmap=32M@16M"
-
- Note that <second-kernel> has to be a vmlinux image. bzImage will not
- work, as of now.
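The memmap= arguments above are not arbitrary: they carve out exactly the relocated first 640k plus the region reserved for the second kernel, i.e. BACKUP_SIZE megabytes starting at BACKUP_BASE. A minimal shell sketch (variable names hypothetical) of how the string follows from the config values chosen in step 2:

```shell
# Build the capture kernel's memmap= arguments from the (assumed)
# Kconfig defaults: BACKUP_BASE=16 MB, BACKUP_SIZE=32 MB.
BACKUP_BASE=16   # CONFIG_BACKUP_BASE, in MB
BACKUP_SIZE=32   # CONFIG_BACKUP_SIZE, in MB
MEMMAP_ARGS="memmap=exactmap memmap=640k@0 memmap=${BACKUP_SIZE}M@${BACKUP_BASE}M"
echo "$MEMMAP_ARGS"
```

With the defaults this produces exactly the memmap string shown in the kexec command above; if you changed BACKUP_BASE or BACKUP_SIZE in step 2, adjust the command accordingly.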
-
-5) Enable kexec based dumping by
-
- echo 1 > /proc/kexec-dump
-
- If this is not set, the system will not do a kexec reboot in the event
- of a panic.
-
-6) System reboots into the second kernel when a panic occurs.
- You could write a module to call panic, for testing purposes.
-
-7) Write out the dump file using
-
- cp /proc/vmcore <dump-file>
-
-You can also access the dump as a device for a linear/raw view. To do this,
-you will need the kd-oldmem-<version>.patch built into the kernel. To create
-the device, type
-
- mknod /dev/oldmem c 1 12
-
-Use "dd" with suitable options for count, bs and skip to access specific
-portions of the dump.
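For example, to extract a 1MB region starting at the 16MB physical mark with a 4k block size (the offsets here are hypothetical, chosen only to show the arithmetic), skip and count are derived as follows:

```shell
# Derive dd's skip/count operands from a byte offset and length.
# The offset and length are illustrative, not prescribed by the dump.
OFFSET=$((16 * 1024 * 1024))   # start of region, in bytes
LENGTH=$((1 * 1024 * 1024))    # bytes to read
BS=4096                        # dd block size
SKIP=$((OFFSET / BS))          # blocks to skip before reading
COUNT=$((LENGTH / BS))         # blocks to copy
echo "dd if=/dev/oldmem of=region.bin bs=$BS skip=$SKIP count=$COUNT"
```

Keeping OFFSET and LENGTH multiples of BS avoids partial-block reads.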
-
-ANALYSIS
-========
-
-You can run gdb on the dump file copied out of /proc/vmcore. Use vmlinux built
-with -g and run
-
- gdb vmlinux <dump-file>
-
-Stack traces for the task on processor 0, register display, and memory
-display all work fine.
-
-TODO
-====
-
-1) Provide a kernel-pages only view for the dump. This could possibly turn up
- as /proc/vmcore-kern.
-2) Provide register contents of all processors (similar to what multi-threaded
- core dumps do).
-3) Modify "crash" to make it recognize this dump.
-4) Make the i386 kernel boot from any location so we can run the second kernel
- from the reserved location instead of the current approach.
-
-CONTACT
-=======
-
-Hariprasad Nellitheertha - hari at in dot ibm dot com
L: linux-kernel@vger.kernel.org
S: Maintained
-KEXEC
-P: Eric Biederman
-P: Randy Dunlap
-M: ebiederm@xmission.com
-M: rddunlap@osdl.org
-W: http://www.xmission.com/~ebiederm/files/kexec/
-W: http://developer.osdl.org/rddunlap/kexec/
-L: linux-kernel@vger.kernel.org
-L: fastboot@osdl.org
-S: Maintained
-
LANMEDIA WAN CARD DRIVER
P: Andrew Stanley-Jones
M: asj@lanmedia.com
}
}
interrupt_redirect_table = ramvec;
-#ifdef CRASH_DUMP_VECTOR
+#ifdef DUMP_VECTOR
ramvec_p = ramvec;
for (i = 0; i < NR_IRQS; i++) {
if ((i % 8) == 0)
ramvec[TRAP0_VEC] = VECTOR(system_call);
ramvec[TRAP3_VEC] = break_vec;
interrupt_redirect_table = ramvec;
-#ifdef CRASH_DUMP_VECTOR
+#ifdef DUMP_VECTOR
ramvec_p = ramvec;
for (i = 0; i < NR_IRQS; i++) {
if ((i % 8) == 0)
generate incorrect output with certain kernel constructs when
-mregparm=3 is used.
-config KERN_PHYS_OFFSET
- int "Physical address where the kernel is loaded (1-112)MB"
- range 1 112
- default "1"
- help
- This gives the physical address where the kernel is loaded.
- Primarily used in the case of kexec on panic where the
- recovery kernel needs to run at a different address than
- the panic-ed kernel.
-
-config KEXEC
- bool "kexec system call (EXPERIMENTAL)"
- depends on EXPERIMENTAL
- help
- kexec is a system call that implements the ability to shutdown your
- current kernel, and to start another kernel. It is like a reboot
- but it is independent of the system firmware. And like a reboot
- you can start any kernel with it, not just Linux.
-
- The name comes from the similarity to the exec system call.
-
- It is an ongoing process to be certain the hardware in a machine
- is properly shutdown, so do not be surprised if this code does not
- initially work for you. It may help to enable device hotplugging
- support. As of this writing the exact hardware interface is
- strongly in flux, so no good recommendation can be made.
-
-config CRASH_DUMP
- bool "kernel crash dumps (EXPERIMENTAL)"
- depends on KEXEC
- help
- Generate crash dump using kexec.
-
-config BACKUP_BASE
- int "location from where the crash dumping kernel will boot (MB)"
- depends on CRASH_DUMP
- default 16
- help
- This is the location where the second kernel will boot from.
-
-config BACKUP_SIZE
- int "Size of memory used by the crash dumping kernel (MB)"
- depends on CRASH_DUMP
- range 16 64
- default 32
- help
- The size of the second kernel's memory.
endmenu
popl %esi # discard address
popl %esi # real mode pointer
xorl %ebx,%ebx
- ljmp $(__BOOT_CS), $KERN_PHYS_OFFSET
+ ljmp $(__BOOT_CS), $0x100000
/*
* We come here, if we were loaded high.
popl %ecx # lcount
popl %edx # high_buffer_start
popl %eax # hcount
- movl $KERN_PHYS_OFFSET,%edi
+ movl $0x100000,%edi
cli # make sure we don't get interrupted
ljmp $(__BOOT_CS), $0x1000 # and jump to the move routine
movsl
movl %ebx,%esi # Restore setup pointer
xorl %ebx,%ebx
- ljmp $(__BOOT_CS), $KERN_PHYS_OFFSET
+ ljmp $(__BOOT_CS), $0x100000
move_routine_end:
#include <linux/tty.h>
#include <video/edid.h>
#include <asm/io.h>
-#include <asm/segment.h>
/*
* gzip declarations
#else
if ((RM_ALT_MEM_K > RM_EXT_MEM_K ? RM_ALT_MEM_K : RM_EXT_MEM_K) < 1024) error("Less than 2MB of memory");
#endif
- output_data = (char *)KERN_PHYS_OFFSET; /* Points to 1M */
+ output_data = (char *)0x100000; /* Points to 1M */
free_mem_end_ptr = (long)real_mode;
}
low_buffer_size = low_buffer_end - LOW_BUFFER_START;
high_loaded = 1;
free_mem_end_ptr = (long)high_buffer_start;
- if ( (KERN_PHYS_OFFSET + low_buffer_size) > ((ulg)high_buffer_start)) {
- high_buffer_start = (uch *)(KERN_PHYS_OFFSET + low_buffer_size);
+ if ( (0x100000 + low_buffer_size) > ((ulg)high_buffer_start)) {
+ high_buffer_start = (uch *)(0x100000 + low_buffer_size);
mv->hcount = 0; /* say: we need not to move high_buffer */
}
else mv->hcount = -1;
obj-$(CONFIG_X86_MPPARSE) += mpparse.o
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o
obj-$(CONFIG_X86_IO_APIC) += io_apic.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_X86_NUMAQ) += numaq.o
obj-$(CONFIG_X86_SUMMIT_NUMA) += summit.o
obj-$(CONFIG_KPROBES) += kprobes.o
outb(0x70, 0x22);
outb(0x00, 0x23);
}
- else {
- /* Go back to Virtual Wire compatibility mode */
- unsigned long value;
-
- /* For the spurious interrupt use vector F, and enable it */
- value = apic_read(APIC_SPIV);
- value &= ~APIC_VECTOR_MASK;
- value |= APIC_SPIV_APIC_ENABLED;
- value |= 0xf;
- apic_write_around(APIC_SPIV, value);
-
- /* For LVT0 make it edge triggered, active high, external and enabled */
- value = apic_read(APIC_LVT0);
- value &= ~(APIC_MODE_MASK | APIC_SEND_PENDING |
- APIC_INPUT_POLARITY | APIC_LVT_REMOTE_IRR |
- APIC_LVT_LEVEL_TRIGGER | APIC_LVT_MASKED );
- value |= APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING;
- value = SET_APIC_DELIVERY_MODE(value, APIC_MODE_EXINT);
- apic_write_around(APIC_LVT0, value);
-
- /* For LVT1 make it edge triggered, active high, nmi and enabled */
- value = apic_read(APIC_LVT1);
- value &= ~(
- APIC_MODE_MASK | APIC_SEND_PENDING |
- APIC_INPUT_POLARITY | APIC_LVT_REMOTE_IRR |
- APIC_LVT_LEVEL_TRIGGER | APIC_LVT_MASKED);
- value |= APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING;
- value = SET_APIC_DELIVERY_MODE(value, APIC_MODE_NMI);
- apic_write_around(APIC_LVT1, value);
- }
}
void disable_local_APIC(void)
+++ /dev/null
-/*
- * Architecture specific (i386) functions for kexec based crash dumps.
- *
- * Created by: Hariprasad Nellitheertha (hari@in.ibm.com)
- *
- * Copyright (C) IBM Corporation, 2004. All rights reserved.
- *
- */
-
-#include <linux/init.h>
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/smp.h>
-#include <linux/irq.h>
-
-#include <asm/crash_dump.h>
-#include <asm/processor.h>
-#include <asm/hardirq.h>
-#include <asm/nmi.h>
-#include <asm/hw_irq.h>
-
-struct pt_regs crash_smp_regs[NR_CPUS];
-long crash_smp_current_task[NR_CPUS];
-
-#ifdef CONFIG_SMP
-static atomic_t waiting_for_dump_ipi;
-static int crash_dump_expect_ipi[NR_CPUS];
-extern void crash_dump_send_ipi(void);
-extern void stop_this_cpu(void *);
-
-static int crash_dump_nmi_callback(struct pt_regs *regs, int cpu)
-{
- if (!crash_dump_expect_ipi[cpu])
- return 0;
-
- crash_dump_expect_ipi[cpu] = 0;
- crash_dump_save_this_cpu(regs, cpu);
- atomic_dec(&waiting_for_dump_ipi);
-
- stop_this_cpu(NULL);
-
- return 1;
-}
-
-void __crash_dump_stop_cpus(void)
-{
- int i, cpu, other_cpus;
-
- preempt_disable();
- cpu = smp_processor_id();
- other_cpus = num_online_cpus()-1;
-
- if (other_cpus > 0) {
- atomic_set(&waiting_for_dump_ipi, other_cpus);
-
- for (i = 0; i < NR_CPUS; i++)
- crash_dump_expect_ipi[i] = (i != cpu && cpu_online(i));
-
- set_nmi_callback(crash_dump_nmi_callback);
- /* Ensure the new callback function is set before sending
- * out the IPI
- */
- wmb();
-
- crash_dump_send_ipi();
- while (atomic_read(&waiting_for_dump_ipi) > 0)
- cpu_relax();
-
- unset_nmi_callback();
- } else {
- local_irq_disable();
- disable_local_APIC();
- local_irq_enable();
- }
- preempt_enable();
-}
-#else
-void __crash_dump_stop_cpus(void) {}
-#endif
-
-void crash_get_current_regs(struct pt_regs *regs)
-{
- __asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
- __asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
- __asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
- __asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
- __asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
- __asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
- __asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
- __asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
- __asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
- __asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
- __asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
- __asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
- __asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
-
- regs->eip = (unsigned long)current_text_addr();
-}
-
-void crash_dump_save_this_cpu(struct pt_regs *regs, int cpu)
-{
- crash_smp_current_task[cpu] = (long)current;
- crash_smp_regs[cpu] = *regs;
-}
-
.long sys_mq_timedreceive /* 280 */
.long sys_mq_notify
.long sys_mq_getsetattr
- .long sys_kexec_load
+ .long sys_ni_syscall /* reserved for kexec */
.long sys_waitid
.long sys_ni_syscall /* 285 */ /* available */
.long sys_add_key
EXPORT_SYMBOL(csum_partial);
-#ifdef CONFIG_CRASH_DUMP
+#ifdef CONFIG_CRASH_DUMP_MODULE
#ifdef CONFIG_SMP
extern irq_desc_t irq_desc[NR_IRQS];
extern unsigned long irq_affinity[NR_IRQS];
EXPORT_SYMBOL(stop_this_cpu);
EXPORT_SYMBOL(dump_send_ipi);
#endif
-extern int page_is_ram(unsigned long);
-EXPORT_SYMBOL(page_is_ram);
+extern int pfn_is_ram(unsigned long);
+EXPORT_SYMBOL(pfn_is_ram);
#ifdef ARCH_HAS_NMI_WATCHDOG
EXPORT_SYMBOL(touch_nmi_watchdog);
#endif
return 0;
}
-static int i8259A_shutdown(struct sys_device *dev)
-{
- /* Put the i8259A into a quiescent state that
- * the kernel initialization code can get it
- * out of.
- */
- outb(0xff, 0x21); /* mask all of 8259A-1 */
- outb(0xff, 0xA1); /* mask all of 8259A-2 */
- return 0;
-}
-
static struct sysdev_class i8259_sysdev_class = {
set_kset_name("i8259"),
.suspend = i8259A_suspend,
.resume = i8259A_resume,
- .shutdown = i8259A_shutdown,
};
static struct sys_device device_i8259A = {
+++ /dev/null
-/*
- * machine_kexec.c - handle transition of Linux booting another kernel
- * Copyright (C) 2002-2004 Eric Biederman <ebiederm@xmission.com>
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#include <linux/mm.h>
-#include <linux/kexec.h>
-#include <linux/delay.h>
-#include <asm/pgtable.h>
-#include <asm/pgalloc.h>
-#include <asm/tlbflush.h>
-#include <asm/mmu_context.h>
-#include <asm/io.h>
-#include <asm/apic.h>
-#include <asm/cpufeature.h>
-#include <asm/crash_dump.h>
-
-static inline unsigned long read_cr3(void)
-{
- unsigned long cr3;
- asm volatile("movl %%cr3,%0": "=r"(cr3));
- return cr3;
-}
-
-#define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
-
-#define L0_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
-#define L1_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
-#define L2_ATTR (_PAGE_PRESENT)
-
-#define LEVEL0_SIZE (1UL << 12UL)
-
-#ifndef CONFIG_X86_PAE
-#define LEVEL1_SIZE (1UL << 22UL)
-static u32 pgtable_level1[1024] PAGE_ALIGNED;
-
-static void identity_map_page(unsigned long address)
-{
- unsigned long level1_index, level2_index;
- u32 *pgtable_level2;
-
- /* Find the current page table */
- pgtable_level2 = __va(read_cr3());
-
- /* Find the indexes of the physical address to identity map */
- level1_index = (address % LEVEL1_SIZE)/LEVEL0_SIZE;
- level2_index = address / LEVEL1_SIZE;
-
- /* Identity map the page table entry */
- pgtable_level1[level1_index] = address | L0_ATTR;
- pgtable_level2[level2_index] = __pa(pgtable_level1) | L1_ATTR;
-
- /* Flush the tlb so the new mapping takes effect.
- * Global tlb entries are not flushed but that is not an issue.
- */
- load_cr3(pgtable_level2);
-}
-
-#else
-#define LEVEL1_SIZE (1UL << 21UL)
-#define LEVEL2_SIZE (1UL << 30UL)
-static u64 pgtable_level1[512] PAGE_ALIGNED;
-static u64 pgtable_level2[512] PAGE_ALIGNED;
-
-static void identity_map_page(unsigned long address)
-{
- unsigned long level1_index, level2_index, level3_index;
- u64 *pgtable_level3;
-
- /* Find the current page table */
- pgtable_level3 = __va(read_cr3());
-
- /* Find the indexes of the physical address to identity map */
- level1_index = (address % LEVEL1_SIZE)/LEVEL0_SIZE;
- level2_index = (address % LEVEL2_SIZE)/LEVEL1_SIZE;
- level3_index = address / LEVEL2_SIZE;
-
- /* Identity map the page table entry */
- pgtable_level1[level1_index] = address | L0_ATTR;
- pgtable_level2[level2_index] = __pa(pgtable_level1) | L1_ATTR;
- set_64bit(&pgtable_level3[level3_index], __pa(pgtable_level2) | L2_ATTR);
-
- /* Flush the tlb so the new mapping takes effect.
- * Global tlb entries are not flushed but that is not an issue.
- */
- load_cr3(pgtable_level3);
-}
-#endif
-
-
-static void set_idt(void *newidt, __u16 limit)
-{
- unsigned char curidt[6];
-
- /* ia32 supports unaligned loads & stores */
- (*(__u16 *)(curidt)) = limit;
- (*(__u32 *)(curidt +2)) = (unsigned long)(newidt);
-
- __asm__ __volatile__ (
- "lidt %0\n"
- : "=m" (curidt)
- );
-};
-
-
-static void set_gdt(void *newgdt, __u16 limit)
-{
- unsigned char curgdt[6];
-
- /* ia32 supports unaligned loads & stores */
- (*(__u16 *)(curgdt)) = limit;
- (*(__u32 *)(curgdt +2)) = (unsigned long)(newgdt);
-
- __asm__ __volatile__ (
- "lgdt %0\n"
- : "=m" (curgdt)
- );
-};
-
-static void load_segments(void)
-{
-#define __STR(X) #X
-#define STR(X) __STR(X)
-
- __asm__ __volatile__ (
- "\tljmp $"STR(__KERNEL_CS)",$1f\n"
- "\t1:\n"
- "\tmovl $"STR(__KERNEL_DS)",%eax\n"
- "\tmovl %eax,%ds\n"
- "\tmovl %eax,%es\n"
- "\tmovl %eax,%fs\n"
- "\tmovl %eax,%gs\n"
- "\tmovl %eax,%ss\n"
- );
-#undef STR
-#undef __STR
-}
-
-typedef asmlinkage void (*relocate_new_kernel_t)(
- unsigned long indirection_page, unsigned long reboot_code_buffer,
- unsigned long start_address, unsigned int has_pae);
-
-const extern unsigned char relocate_new_kernel[];
-extern void relocate_new_kernel_end(void);
-const extern unsigned int relocate_new_kernel_size;
-
-/*
- * Do whatever setup is needed on the image and the
- * reboot code buffer to allow us to avoid allocations
- * later. Currently nothing is needed.
- */
-int machine_kexec_prepare(struct kimage *image)
-{
- return 0;
-}
-
-void machine_kexec_cleanup(struct kimage *image)
-{
-}
-
-/*
- * We are going to do a memory preserving reboot. So, we copy over the
- * first 640k of memory into a backup location. Though the second kernel
- * boots from a different location, it still requires the first 640k.
- * Hence this backup.
- */
-void __crash_relocate_mem(unsigned long backup_addr, unsigned long backup_size)
-{
- unsigned long pfn, pfn_max;
- void *src_addr, *dest_addr;
- struct page *page;
-
- pfn_max = backup_size >> PAGE_SHIFT;
- for (pfn = 0; pfn < pfn_max; pfn++) {
- src_addr = phys_to_virt(pfn << PAGE_SHIFT);
- dest_addr = backup_addr + src_addr;
- if (!pfn_valid(pfn))
- continue;
- page = pfn_to_page(pfn);
- if (PageReserved(page))
- copy_page(dest_addr, src_addr);
- }
-}
-
-/*
- * Do not allocate memory (or fail in any way) in machine_kexec().
- * We are past the point of no return, committed to rebooting now.
- */
-void machine_kexec(struct kimage *image)
-{
- unsigned long indirection_page;
- unsigned long reboot_code_buffer;
- relocate_new_kernel_t rnk;
-
- /* Interrupts aren't acceptable while we reboot */
- local_irq_disable();
-
- /* Compute some offsets */
- reboot_code_buffer = page_to_pfn(image->control_code_page) << PAGE_SHIFT;
- indirection_page = image->head & PAGE_MASK;
-
- /* Set up an identity mapping for the reboot_code_buffer */
- identity_map_page(reboot_code_buffer);
-
- /* copy it out */
- memcpy((void *)reboot_code_buffer, relocate_new_kernel, relocate_new_kernel_size);
-
- /* The segment registers are funny things: they are
- * automatically loaded from a table in memory whenever you
- * set them to a specific selector, but the table is never
- * accessed again once you set the segment to a different
- * selector.
- *
- * The more common model is a cache where the behind-the-
- * scenes work is done, but which is also dropped at
- * arbitrary times.
- *
- * I take advantage of this here by force loading the
- * segments, before I zap the gdt with an invalid value.
- */
- load_segments();
- /* The gdt & idt are now invalid.
- * If you want to load them you must set up your own idt & gdt.
- */
- set_gdt(phys_to_virt(0),0);
- set_idt(phys_to_virt(0),0);
-
- /* now call it */
- rnk = (relocate_new_kernel_t) reboot_code_buffer;
- (*rnk)(indirection_page, reboot_code_buffer, image->start, cpu_has_pae);
-}
int reboot_thru_bios;
#ifdef CONFIG_SMP
+int reboot_smp = 0;
static int reboot_cpu = -1;
/* shamelessly grabbed from lib/vsprintf.c for readability */
#define is_digit(c) ((c) >= '0' && (c) <= '9')
break;
#ifdef CONFIG_SMP
case 's': /* "smp" reboot by executing reset on BSP or other CPU*/
+ reboot_smp = 1;
if (is_digit(*(str+1))) {
reboot_cpu = (int) (*(str+1) - '0');
if (is_digit(*(str+2)))
return 0;
}
+/*
+ * Some machines require the "reboot=s" command-line option; this quirk makes that automatic.
+ */
+static int __init set_smp_reboot(struct dmi_system_id *d)
+{
+#ifdef CONFIG_SMP
+ if (!reboot_smp) {
+ reboot_smp = 1;
+ printk(KERN_INFO "%s series board detected. Selecting SMP-method for reboots.\n", d->ident);
+ }
+#endif
+ return 0;
+}
+
+/*
+ * Some machines require the "reboot=b,s" command-line option; this quirk makes that automatic.
+ */
+static int __init set_smp_bios_reboot(struct dmi_system_id *d)
+{
+ set_smp_reboot(d);
+ set_bios_reboot(d);
+ return 0;
+}
+
static struct dmi_system_id __initdata reboot_dmi_table[] = {
{ /* Handle problems with rebooting on Dell 1300's */
- .callback = set_bios_reboot,
+ .callback = set_smp_bios_reboot,
.ident = "Dell PowerEdge 1300",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Computer Corporation"),
: "i" ((void *) (0x1000 - sizeof (real_mode_switch) - 100)));
}
-void machine_shutdown(void)
+void machine_restart(char * __unused)
{
#ifdef CONFIG_SMP
- int reboot_cpu_id;
-
- /* The boot cpu is always logical cpu 0 */
- reboot_cpu_id = 0;
-
- /* See if there has been given a command line override */
- if ((reboot_cpu_id != -1) && (reboot_cpu < NR_CPUS) &&
- cpu_isset(reboot_cpu, cpu_online_map)) {
- reboot_cpu_id = reboot_cpu;
+ int cpuid;
+
+ cpuid = GET_APIC_ID(apic_read(APIC_ID));
+
+ if (reboot_smp) {
+
+ /* Check to see if reboot_cpu is valid;
+ if it's not, default to the BSP. */
+ if ((reboot_cpu == -1) ||
+ (reboot_cpu > (NR_CPUS -1)) ||
+ !physid_isset(cpuid, phys_cpu_present_map))
+ reboot_cpu = boot_cpu_physical_apicid;
+
+ reboot_smp = 0; /* use this as a flag to go through this only once */
+ /* Re-run this function on the other CPUs. They will fall
+ through this section since we have cleared reboot_smp,
+ and do the reboot if they are the correct CPU;
+ otherwise they halt. */
+ if (reboot_cpu != cpuid)
+ smp_call_function((void *)machine_restart, NULL, 1, 0);
}
- /* Make certain the cpu I'm rebooting on is online */
- if (!cpu_isset(reboot_cpu_id, cpu_online_map)) {
- reboot_cpu_id = smp_processor_id();
+ /* if reboot_cpu is still -1, then we want a traditional reboot,
+ and if we are not running on the reboot_cpu, halt */
+ if ((reboot_cpu != -1) && (cpuid != reboot_cpu)) {
+ for (;;)
+ __asm__ __volatile__ ("hlt");
}
-
- /* Make certain I only run on the appropriate processor */
- set_cpus_allowed(current, cpumask_of_cpu(reboot_cpu_id));
-
- /* O.K. Now that I'm on the appropriate processor, stop
- * all of the others, and disable their local APICs.
+ /*
+ * Stop all CPUs and turn off local APICs and the IO-APIC, so
+ * other OSs see a clean IRQ state.
*/
-
smp_send_stop();
#endif /* CONFIG_SMP */
#ifdef CONFIG_X86_IO_APIC
disable_IO_APIC();
#endif
-}
-
-void machine_restart(char * __unused)
-{
- machine_shutdown();
if (!reboot_thru_bios) {
if (efi_enabled) {
+++ /dev/null
-/*
- * relocate_kernel.S - put the kernel image in place to boot
- * Copyright (C) 2002-2004 Eric Biederman <ebiederm@xmission.com>
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#include <linux/linkage.h>
-
- /*
- * Must be relocatable PIC code callable as a C function that, once
- * it starts, cannot use the previous process's stack.
- */
- .globl relocate_new_kernel
-relocate_new_kernel:
- /* read the arguments and say goodbye to the stack */
- movl 4(%esp), %ebx /* indirection_page */
- movl 8(%esp), %ebp /* reboot_code_buffer */
- movl 12(%esp), %edx /* start address */
- movl 16(%esp), %ecx /* cpu_has_pae */
-
- /* zero out flags, and disable interrupts */
- pushl $0
- popfl
-
- /* set a new stack at the bottom of our page... */
- lea 4096(%ebp), %esp
-
- /* store the parameters back on the stack */
- pushl %edx /* store the start address */
-
- /* Set cr0 to a known state:
- * 31 0 == Paging disabled
- * 18 0 == Alignment check disabled
- * 16 0 == Write protect disabled
- * 3 0 == No task switch
- * 2 0 == Don't do FP software emulation.
- * 0 1 == Protected mode enabled
- */
- movl %cr0, %eax
- andl $~((1<<31)|(1<<18)|(1<<16)|(1<<3)|(1<<2)), %eax
- orl $(1<<0), %eax
- movl %eax, %cr0
-
- /* clear cr4 if applicable */
- testl %ecx, %ecx
- jz 1f
- /* Set cr4 to a known state:
- * Setting everything to zero seems safe.
- */
- movl %cr4, %eax
- andl $0, %eax
- movl %eax, %cr4
-
- jmp 1f
-1:
-
- /* Flush the TLB (needed?) */
- xorl %eax, %eax
- movl %eax, %cr3
-
- /* Do the copies */
- cld
-0: /* top, read another word for the indirection page */
- movl %ebx, %ecx
- movl (%ebx), %ecx
- addl $4, %ebx
- testl $0x1, %ecx /* is it a destination page */
- jz 1f
- movl %ecx, %edi
- andl $0xfffff000, %edi
- jmp 0b
-1:
- testl $0x2, %ecx /* is it an indirection page */
- jz 1f
- movl %ecx, %ebx
- andl $0xfffff000, %ebx
- jmp 0b
-1:
- testl $0x4, %ecx /* is it the done indicator */
- jz 1f
- jmp 2f
-1:
- testl $0x8, %ecx /* is it the source indicator */
- jz 0b /* Ignore it otherwise */
- movl %ecx, %esi /* For every source page do a copy */
- andl $0xfffff000, %esi
-
- movl $1024, %ecx
- rep ; movsl
- jmp 0b
-
-2:
-
- /* To be certain of avoiding problems with self-modifying code
- * I need to execute a serializing instruction here.
- * So I flush the TLB; it's handy and not processor dependent.
- */
- xorl %eax, %eax
- movl %eax, %cr3
-
- /* set all of the registers to known values */
- /* leave %esp alone */
-
- xorl %eax, %eax
- xorl %ebx, %ebx
- xorl %ecx, %ecx
- xorl %edx, %edx
- xorl %esi, %esi
- xorl %edi, %edi
- xorl %ebp, %ebp
- ret
-relocate_new_kernel_end:
-
- .globl relocate_new_kernel_size
-relocate_new_kernel_size:
- .long relocate_new_kernel_end - relocate_new_kernel
#include <asm/io_apic.h>
#include <asm/ist.h>
#include <asm/io.h>
-#include <asm/crash_dump.h>
#include "setup_arch_pre.h"
#include <bios_ebda.h>
unsigned long init_pg_tables_end __initdata = ~0UL;
int disable_pse __initdata = 0;
-unsigned int dump_enabled;
/*
* Machine setup..
if (to != command_line)
to--;
if (!memcmp(from+7, "exactmap", 8)) {
- /* If we are doing a crash dump, we
- * still need to know the real mem
- * size.
- */
- set_saved_max_pfn();
from += 8+7;
e820.nr_map = 0;
userdef = 1;
*/
if (c == ' ' && !memcmp(from, "highmem=", 8))
highmem_pages = memparse(from+8, &from) >> PAGE_SHIFT;
-
- if (!memcmp(from, "dump", 4))
- dump_enabled = 1;
if (c == ' ' && !memcmp(from, "crashdump=", 10))
crashdump_addr = memparse(from+10, &from);
}
}
#endif
-
- crash_reserve_bootmem();
-
return max_low_pfn;
}
#else
*/
apic_wait_icr_idle();
- if (vector == CRASH_DUMP_VECTOR)
- cfg = (cfg&~APIC_VECTOR_MASK)|APIC_DM_NMI;
-
/*
* No need to touch the target chip field
*/
cfg = __prepare_ICR(shortcut, vector);
- if (vector == CRASH_DUMP_VECTOR) {
+ if (vector == DUMP_VECTOR) {
/*
* Setup DUMP IPI to be delivered as an NMI
*/
*/
cfg = __prepare_ICR(0, vector);
- if (vector == CRASH_DUMP_VECTOR) {
+ if (vector == DUMP_VECTOR) {
/*
* Setup DUMP IPI to be delivered as an NMI
*/
void dump_send_ipi(void)
{
- send_IPI_allbutself(CRASH_DUMP_VECTOR);
+ send_IPI_allbutself(DUMP_VECTOR);
}
/*
send_IPI_mask(cpumask_of_cpu(cpu), RESCHEDULE_VECTOR);
}
-void crash_dump_send_ipi(void)
-{
- send_IPI_allbutself(CRASH_DUMP_VECTOR);
-}
-
/*
* Structure and data for smp_call_function(). This is designed to minimise
* static memory requirements. It also looks cleaner.
* Written by Martin Mares <mj@atrey.karlin.mff.cuni.cz>;
*/
-#define LOAD_OFFSET __PAGE_OFFSET
-
#include <asm-generic/vmlinux.lds.h>
#include <asm/thread_info.h>
#include <asm/page.h>
-#include <asm/segment.h>
OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
OUTPUT_ARCH(i386)
-ENTRY(phys_startup_32)
+ENTRY(startup_32)
jiffies = jiffies_64;
SECTIONS
{
- . = LOAD_OFFSET + KERN_PHYS_OFFSET;
- phys_startup_32 = startup_32 - LOAD_OFFSET;
+ . = __PAGE_OFFSET + 0x100000;
/* read-only */
_text = .; /* Text and read-only data */
- .text : AT(ADDR(.text) - LOAD_OFFSET) {
+ .text : {
*(.text)
SCHED_TEXT
LOCK_TEXT
. = ALIGN(16); /* Exception table */
__start___ex_table = .;
- __ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) { *(__ex_table) }
+ __ex_table : { *(__ex_table) }
__stop___ex_table = .;
RODATA
/* writeable */
- .data : AT(ADDR(.data) - LOAD_OFFSET) { /* Data */
+ .data : { /* Data */
*(.data)
CONSTRUCTORS
}
. = ALIGN(4096);
__nosave_begin = .;
- .data_nosave : AT(ADDR(.data_nosave) - LOAD_OFFSET) { *(.data.nosave) }
+ .data_nosave : { *(.data.nosave) }
. = ALIGN(4096);
__nosave_end = .;
. = ALIGN(4096);
- .data.page_aligned : AT(ADDR(.data.page_aligned) - LOAD_OFFSET) { *(.data.idt) }
+ .data.page_aligned : { *(.data.idt) }
. = ALIGN(32);
- .data.cacheline_aligned : AT(ADDR(.data.cacheline_aligned) - LOAD_OFFSET) {
- *(.data.cacheline_aligned)
- }
+ .data.cacheline_aligned : { *(.data.cacheline_aligned) }
_edata = .; /* End of data section */
. = ALIGN(THREAD_SIZE); /* init_task */
- .data.init_task : AT(ADDR(.data.init_task) - LOAD_OFFSET) { *(.data.init_task) }
+ .data.init_task : { *(.data.init_task) }
/* will be freed after init */
. = ALIGN(4096); /* Init code and data */
__init_begin = .;
- .init.text : AT(ADDR(.init.text) - LOAD_OFFSET) {
+ .init.text : {
_sinittext = .;
*(.init.text)
_einittext = .;
}
- .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) { *(.init.data) }
+ .init.data : { *(.init.data) }
. = ALIGN(16);
__setup_start = .;
- .init.setup : AT(ADDR(.init.setup) - LOAD_OFFSET) { *(.init.setup) }
+ .init.setup : { *(.init.setup) }
__setup_end = .;
__initcall_start = .;
- .initcall.init : AT(ADDR(.initcall.init) - LOAD_OFFSET) {
+ .initcall.init : {
*(.initcall1.init)
*(.initcall2.init)
*(.initcall3.init)
}
__initcall_end = .;
__con_initcall_start = .;
- .con_initcall.init : AT(ADDR(.con_initcall.init) - LOAD_OFFSET) {
- *(.con_initcall.init)
- }
+ .con_initcall.init : { *(.con_initcall.init) }
__con_initcall_end = .;
SECURITY_INIT
. = ALIGN(4);
__alt_instructions = .;
- .altinstructions : AT(ADDR(.altinstructions) - LOAD_OFFSET) {
- *(.altinstructions)
- }
- __alt_instructions_end = .;
- .altinstr_replacement : AT(ADDR(.altinstr_replacement) - LOAD_OFFSET) {
- *(.altinstr_replacement)
- }
+ .altinstructions : { *(.altinstructions) }
+ __alt_instructions_end = .;
+ .altinstr_replacement : { *(.altinstr_replacement) }
/* .exit.text is discard at runtime, not link time, to deal with references
from .altinstructions and .eh_frame */
- .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
- .exit.data : AT(ADDR(.exit.data) - LOAD_OFFSET) { *(.exit.data) }
+ .exit.text : { *(.exit.text) }
+ .exit.data : { *(.exit.data) }
. = ALIGN(4096);
__initramfs_start = .;
- .init.ramfs : AT(ADDR(.init.ramfs) - LOAD_OFFSET) { *(.init.ramfs) }
+ .init.ramfs : { *(.init.ramfs) }
__initramfs_end = .;
. = ALIGN(32);
__per_cpu_start = .;
- .data.percpu : AT(ADDR(.data.percpu) - LOAD_OFFSET) { *(.data.percpu) }
+ .data.percpu : { *(.data.percpu) }
__per_cpu_end = .;
. = ALIGN(4096);
__init_end = .;
/* freed after init ends here */
__bss_start = .; /* BSS */
- .bss.page_aligned : AT(ADDR(.bss.page_aligned) - LOAD_OFFSET) {
- *(.bss.page_aligned) }
- .bss : AT(ADDR(.bss) - LOAD_OFFSET) {
+ .bss : {
+ *(.bss.page_aligned)
*(.bss)
}
. = ALIGN(4);
#include <asm/e820.h>
#include <asm/setup.h>
#include <asm/mmzone.h>
-#include <asm/crash_dump.h>
#include <bios_ebda.h>
struct pglist_data *node_data[MAX_NUMNODES];
}
}
#endif
-
- crash_reserve_bootmem();
-
return system_max_low_pfn;
}
preempt_check_resched();
}
-/* This is the same as kmap_atomic() but can map memory that doesn't
- * have a struct page associated with it.
- */
-char *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
-{
- enum fixed_addresses idx;
- unsigned long vaddr;
-
- inc_preempt_count();
-
- idx = type + KM_TYPE_NR*smp_processor_id();
- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
- set_pte(kmap_pte-idx, pfn_pte(pfn, kmap_prot));
- __flush_tlb_one(vaddr);
-
- return (char *)vaddr;
-}
-
struct page *kmap_atomic_to_page(void *ptr)
{
unsigned long idx, vaddr = (unsigned long)ptr;
pte = kmap_pte - (idx - FIX_KMAP_BEGIN);
return pte_page(*pte);
}
+
here. Saying Y here will not hurt performance (on any machine) but
will increase the size of the kernel.
-config KEXEC
- bool "kexec system call (EXPERIMENTAL)"
- depends on EXPERIMENTAL
- help
- kexec is a system call that implements the ability to shutdown your
- current kernel, and to start another kernel. It is like a reboot
- but it is independent of the system firmware. And like a reboot
- you can start any kernel with it, not just Linux.
-
- The name comes from the similarity to the exec system call.
-
- It is an ongoing process to be certain the hardware in a machine
- is properly shut down, so do not be surprised if this code does not
- initially work for you. It may help to enable device hotplugging
- support. As of this writing the exact hardware interface is
- strongly in flux, so no good recommendation can be made.
-
- In the GameCube implementation, kexec allows you to load and
- run DOL files, including kernel and homebrew DOLs.
-
source "drivers/cpufreq/Kconfig"
config CPU_FREQ_PMAC
obj-$(CONFIG_SMP) += smp.o smp-tbsync.o
obj-$(CONFIG_TAU) += temp.o
obj-$(CONFIG_ALTIVEC) += vecemu.o vector.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
ifndef CONFIG_MATH_EMULATION
obj-$(CONFIG_8xx) += softemu8xx.o
+++ /dev/null
-/*
- * machine_kexec.c - handle transition of Linux booting another kernel
- * Copyright (C) 2002-2003 Eric Biederman <ebiederm@xmission.com>
- *
- * GameCube/ppc32 port Copyright (C) 2004 Albert Herranz
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#include <linux/mm.h>
-#include <linux/kexec.h>
-#include <linux/delay.h>
-#include <linux/reboot.h>
-#include <asm/pgtable.h>
-#include <asm/pgalloc.h>
-#include <asm/mmu_context.h>
-#include <asm/io.h>
-#include <asm/hw_irq.h>
-#include <asm/cacheflush.h>
-#include <asm/machdep.h>
-
-typedef void (*relocate_new_kernel_t)(
- unsigned long indirection_page, unsigned long reboot_code_buffer,
- unsigned long start_address);
-
-const extern unsigned char relocate_new_kernel[];
-const extern unsigned int relocate_new_kernel_size;
-
-void machine_shutdown(void)
-{
- if (ppc_md.machine_shutdown) {
- ppc_md.machine_shutdown();
- }
-}
-
-/*
- * Do whatever setup is needed on the image and the
- * reboot code buffer to allow us to avoid allocations
- * later.
- */
-int machine_kexec_prepare(struct kimage *image)
-{
- if (ppc_md.machine_kexec_prepare) {
- return ppc_md.machine_kexec_prepare(image);
- }
- /*
- * Fail if platform doesn't provide its own machine_kexec_prepare
- * implementation.
- */
- return -ENOSYS;
-}
-
-void machine_kexec_cleanup(struct kimage *image)
-{
- if (ppc_md.machine_kexec_cleanup) {
- ppc_md.machine_kexec_cleanup(image);
- }
-}
-
-/*
- * Do not allocate memory (or fail in any way) in machine_kexec().
- * We are past the point of no return, committed to rebooting now.
- */
-void machine_kexec(struct kimage *image)
-{
- if (ppc_md.machine_kexec) {
- ppc_md.machine_kexec(image);
- } else {
- /*
- * Fall back to normal restart if platform doesn't provide
- * its own kexec function, and the user insists on kexec...
- */
- machine_restart(NULL);
- }
-}
-
-
-/*
- * This is a generic machine_kexec function suitable at least for
- * non-OpenFirmware embedded platforms.
- * It merely copies the image relocation code to the control page and
- * jumps to it.
- * A platform specific function may just call this one.
- */
-void machine_kexec_simple(struct kimage *image)
-{
- unsigned long indirection_page;
- unsigned long reboot_code_buffer, reboot_code_buffer_phys;
- relocate_new_kernel_t rnk;
-
- /* Interrupts aren't acceptable while we reboot */
- local_irq_disable();
-
- indirection_page = image->head & PAGE_MASK;
-
- /* we need both effective and real address here */
- reboot_code_buffer =
- (unsigned long)page_address(image->control_code_page);
- reboot_code_buffer_phys = virt_to_phys((void *)reboot_code_buffer);
-
- /* copy our kernel relocation code to the control code page */
- memcpy((void *)reboot_code_buffer,
- relocate_new_kernel, relocate_new_kernel_size);
-
- flush_icache_range(reboot_code_buffer,
- reboot_code_buffer + KEXEC_CONTROL_CODE_SIZE);
- printk(KERN_INFO "Bye!\n");
-
- /* now call it */
- rnk = (relocate_new_kernel_t) reboot_code_buffer;
- (*rnk)(indirection_page, reboot_code_buffer_phys, image->start);
-}
-
+++ /dev/null
-/*
- * relocate_kernel.S - put the kernel image in place to boot
- * Copyright (C) 2002-2003 Eric Biederman <ebiederm@xmission.com>
- *
- * GameCube/ppc32 port Copyright (C) 2004 Albert Herranz
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#include <asm/reg.h>
-#include <asm/ppc_asm.h>
-#include <asm/processor.h>
-
-#include <asm/kexec.h>
-
-#define PAGE_SIZE 4096 /* must be same value as in <asm/page.h> */
-
-/* returns r3 = relocated address of sym */
-/* modifies r0 */
-#define RELOC_SYM(sym) \
- mflr r3; \
- bl 1f; \
-1: mflr r0; \
- mtlr r3; \
- lis r3, 1b@ha; \
- ori r3, r3, 1b@l; \
- subf r0, r3, r0; \
- lis r3, sym@ha; \
- ori r3, r3, sym@l; \
- add r3, r3, r0
-
- /*
- * Must be relocatable PIC code callable as a C function.
- */
- .globl relocate_new_kernel
-relocate_new_kernel:
- /* r3 = indirection_page */
- /* r4 = reboot_code_buffer */
- /* r5 = start_address */
-
- li r0, 0
-
- /*
- * Set Machine Status Register to a known status,
- * switch the MMU off and jump to 1: in a single step.
- */
-
- mr r8, r0
- ori r8, r8, MSR_RI|MSR_ME
- mtspr SRR1, r8
- addi r8, r4, 1f - relocate_new_kernel
- mtspr SRR0, r8
- sync
- rfi
-
-1:
- /* from this point address translation is turned off */
- /* and interrupts are disabled */
-
- /* set a new stack at the bottom of our page... */
- /* (not really needed now) */
- addi r1, r4, KEXEC_CONTROL_CODE_SIZE - 8 /* for LR Save+Back Chain */
- stw r0, 0(r1)
-
- /* Do the copies */
- li r6, 0 /* checksum */
- subi r3, r3, 4
-
-0: /* top, read another word for the indirection page */
- lwzu r0, 4(r3)
-
- /* is it a destination page? (r8) */
- rlwinm. r7, r0, 0, 31, 31 /* IND_DESTINATION (1<<0) */
- beq 1f
-
- rlwinm r8, r0, 0, 0, 19 /* clear kexec flags, page align */
- b 0b
-
-1: /* is it an indirection page? (r3) */
- rlwinm. r7, r0, 0, 30, 30 /* IND_INDIRECTION (1<<1) */
- beq 1f
-
- rlwinm r3, r0, 0, 0, 19 /* clear kexec flags, page align */
- subi r3, r3, 4
- b 0b
-
-1: /* are we done? */
- rlwinm. r7, r0, 0, 29, 29 /* IND_DONE (1<<2) */
- beq 1f
- b 2f
-
-1: /* is it a source page? (r9) */
- rlwinm. r7, r0, 0, 28, 28 /* IND_SOURCE (1<<3) */
- beq 0b
-
- rlwinm r9, r0, 0, 0, 19 /* clear kexec flags, page align */
-
- li r7, PAGE_SIZE / 4
- mtctr r7
- subi r9, r9, 4
- subi r8, r8, 4
-9:
- lwzu r0, 4(r9) /* do the copy */
- xor r6, r6, r0
- stwu r0, 4(r8)
- dcbst 0, r8
- sync
- icbi 0, r8
- bdnz 9b
-
- addi r9, r9, 4
- addi r8, r8, 4
- b 0b
-
-2:
-
- /* To be certain of avoiding problems with self-modifying code
- * execute a serializing instruction here.
- */
- isync
- sync
-
- /* jump to the entry point, usually the setup routine */
- mtlr r5
- blrl
-
-1: b 1b
-
-relocate_new_kernel_end:
-
- .globl relocate_new_kernel_size
-relocate_new_kernel_size:
- .long relocate_new_kernel_end - relocate_new_kernel
-
depends on IA32_EMULATION
default y
-config KEXEC
- bool "kexec system call (EXPERIMENTAL)"
- depends on EXPERIMENTAL
- help
- kexec is a system call that implements the ability to shutdown your
- current kernel, and to start another kernel. It is like a reboot
- but it is independent of the system firmware. And like a reboot
- you can start any kernel with it, not just Linux.
-
- The name comes from the similarity to the exec system call.
-
- It is an ongoing process to be certain the hardware in a machine
- is properly shut down, so do not be surprised if this code does not
- initially work for you. It may help to enable device hotplugging
- support. As of this writing the exact hardware interface is
- strongly in flux, so no good recommendation can be made.
-
endmenu
source drivers/Kconfig
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o
obj-$(CONFIG_X86_IO_APIC) += io_apic.o mpparse.o \
genapic.o genapic_cluster.o genapic_flat.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
obj-$(CONFIG_PM) += suspend.o
obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend_asm.o
obj-$(CONFIG_CPU_FREQ) += cpufreq/
outb(0x70, 0x22);
outb(0x00, 0x23);
}
- else {
- /* Go back to Virtual Wire compatibility mode */
- unsigned long value;
-
- /* For the spurious interrupt use vector F, and enable it */
- value = apic_read(APIC_SPIV);
- value &= ~APIC_VECTOR_MASK;
- value |= APIC_SPIV_APIC_ENABLED;
- value |= 0xf;
- apic_write_around(APIC_SPIV, value);
-
- /* For LVT0 make it edge triggered, active high, external and enabled */
- value = apic_read(APIC_LVT0);
- value &= ~(APIC_MODE_MASK | APIC_SEND_PENDING |
- APIC_INPUT_POLARITY | APIC_LVT_REMOTE_IRR |
- APIC_LVT_LEVEL_TRIGGER | APIC_LVT_MASKED );
- value |= APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING;
- value = SET_APIC_DELIVERY_MODE(value, APIC_MODE_EXINT);
- apic_write_around(APIC_LVT0, value);
-
- /* For LVT1 make it edge triggered, active high, nmi and enabled */
- value = apic_read(APIC_LVT1);
- value &= ~(
- APIC_MODE_MASK | APIC_SEND_PENDING |
- APIC_INPUT_POLARITY | APIC_LVT_REMOTE_IRR |
- APIC_LVT_LEVEL_TRIGGER | APIC_LVT_MASKED);
- value |= APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING;
- value = SET_APIC_DELIVERY_MODE(value, APIC_MODE_NMI);
- apic_write_around(APIC_LVT1, value);
- }
}
void disable_local_APIC(void)
int i;
for (i = 0; i < e820.nr_map; i++) {
struct resource *res;
+ if (e820.map[i].addr + e820.map[i].size > 0x100000000ULL)
+ continue;
res = alloc_bootmem_low(sizeof(struct resource));
switch (e820.map[i].type) {
case E820_RAM: res->name = "System RAM"; break;
return 0;
}
-
-
-static int i8259A_shutdown(struct sys_device *dev)
-{
- /* Put the i8259A into a quiescent state that
- * the kernel initialization code can get it
- * out of.
- */
- outb(0xff, 0x21); /* mask all of 8259A-1 */
- outb(0xff, 0xA1); /* mask all of 8259A-2 */
- return 0;
-}
-
static struct sysdev_class i8259_sysdev_class = {
set_kset_name("i8259"),
.suspend = i8259A_suspend,
.resume = i8259A_resume,
- .shutdown = i8259A_shutdown,
};
static struct sys_device device_i8259A = {
/*
* Find the pin to which IRQ[irq] (ISA) is connected
*/
-static int find_isa_irq_pin(int irq, int type)
+static int __init find_isa_irq_pin(int irq, int type)
{
int i;
*/
void disable_IO_APIC(void)
{
- int pin;
/*
* Clear the IO-APIC before rebooting:
*/
clear_IO_APIC();
- /*
- * If the i8259 is routed through an IOAPIC,
- * put that IOAPIC in virtual wire mode
- * so legacy interrupts can be delivered.
- */
- pin = find_isa_irq_pin(0, mp_ExtINT);
- if (pin != -1) {
- struct IO_APIC_route_entry entry;
- unsigned long flags;
-
- memset(&entry, 0, sizeof(entry));
- entry.mask = 0; /* Enabled */
- entry.trigger = 0; /* Edge */
- entry.irr = 0;
- entry.polarity = 0; /* High */
- entry.delivery_status = 0;
- entry.dest_mode = 0; /* Physical */
- entry.delivery_mode = 7; /* ExtInt */
- entry.vector = 0;
- entry.dest.physical.physical_dest = 0;
-
-
- /*
- * Add it to the IO-APIC irq-routing table:
- */
- spin_lock_irqsave(&ioapic_lock, flags);
- io_apic_write(0, 0x11+2*pin, *(((int *)&entry)+1));
- io_apic_write(0, 0x10+2*pin, *(((int *)&entry)+0));
- spin_unlock_irqrestore(&ioapic_lock, flags);
- }
-
disconnect_bsp_APIC();
}
+++ /dev/null
-/*
- * machine_kexec.c - handle transition of Linux booting another kernel
- * Copyright (C) 2002-2004 Eric Biederman <ebiederm@xmission.com>
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#include <linux/mm.h>
-#include <linux/kexec.h>
-#include <linux/delay.h>
-#include <linux/string.h>
-#include <linux/reboot.h>
-#include <asm/pda.h>
-#include <asm/pgtable.h>
-#include <asm/pgalloc.h>
-#include <asm/tlbflush.h>
-#include <asm/mmu_context.h>
-#include <asm/io.h>
-#include <asm/apic.h>
-#include <asm/cpufeature.h>
-#include <asm/hw_irq.h>
-
-#define LEVEL0_SIZE (1UL << 12UL)
-#define LEVEL1_SIZE (1UL << 21UL)
-#define LEVEL2_SIZE (1UL << 30UL)
-#define LEVEL3_SIZE (1UL << 39UL)
-#define LEVEL4_SIZE (1UL << 48UL)
-
-#define L0_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
-#define L1_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE)
-#define L2_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
-#define L3_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
-
-static void init_level2_page(
- uint64_t *level2p, unsigned long addr)
-{
- unsigned long end_addr;
- addr &= PAGE_MASK;
- end_addr = addr + LEVEL2_SIZE;
- while(addr < end_addr) {
- *(level2p++) = addr | L1_ATTR;
- addr += LEVEL1_SIZE;
- }
-}
-
-static int init_level3_page(struct kimage *image,
- uint64_t *level3p, unsigned long addr, unsigned long last_addr)
-{
- unsigned long end_addr;
- int result;
- result = 0;
- addr &= PAGE_MASK;
- end_addr = addr + LEVEL3_SIZE;
- while((addr < last_addr) && (addr < end_addr)) {
- struct page *page;
- uint64_t *level2p;
- page = kimage_alloc_control_pages(image, 0);
- if (!page) {
- result = -ENOMEM;
- goto out;
- }
- level2p = (uint64_t *)page_address(page);
- init_level2_page(level2p, addr);
- *(level3p++) = __pa(level2p) | L2_ATTR;
- addr += LEVEL2_SIZE;
- }
- /* clear the unused entries */
- while(addr < end_addr) {
- *(level3p++) = 0;
- addr += LEVEL2_SIZE;
- }
-out:
- return result;
-}
-
-
-static int init_level4_page(struct kimage *image,
- uint64_t *level4p, unsigned long addr, unsigned long last_addr)
-{
- unsigned long end_addr;
- int result;
- result = 0;
- addr &= PAGE_MASK;
- end_addr = addr + LEVEL4_SIZE;
- while((addr < last_addr) && (addr < end_addr)) {
- struct page *page;
- uint64_t *level3p;
- page = kimage_alloc_control_pages(image, 0);
- if (!page) {
- result = -ENOMEM;
- goto out;
- }
- level3p = (uint64_t *)page_address(page);
- result = init_level3_page(image, level3p, addr, last_addr);
- if (result) {
- goto out;
- }
- *(level4p++) = __pa(level3p) | L3_ATTR;
- addr += LEVEL3_SIZE;
- }
- /* clear the unused entries */
- while(addr < end_addr) {
- *(level4p++) = 0;
- addr += LEVEL3_SIZE;
- }
- out:
- return result;
-}
-
-
-static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
-{
- uint64_t *level4p;
- level4p = (uint64_t *)__va(start_pgtable);
- return init_level4_page(image, level4p, 0, end_pfn << PAGE_SHIFT);
-}
-
-static void set_idt(void *newidt, __u16 limit)
-{
- unsigned char curidt[10];
-
- /* x86-64 supports unaligned loads & stores */
- (*(__u16 *)(curidt)) = limit;
- (*(__u64 *)(curidt +2)) = (unsigned long)(newidt);
-
- __asm__ __volatile__ (
- "lidt %0\n"
- : "=m" (curidt)
- );
-};
-
-
-static void set_gdt(void *newgdt, __u16 limit)
-{
- unsigned char curgdt[10];
-
- /* x86-64 supports unaligned loads & stores */
- (*(__u16 *)(curgdt)) = limit;
- (*(__u64 *)(curgdt +2)) = (unsigned long)(newgdt);
-
- __asm__ __volatile__ (
- "lgdt %0\n"
- : "=m" (curgdt)
- );
-};
-
-static void load_segments(void)
-{
- __asm__ __volatile__ (
- "\tmovl $"STR(__KERNEL_DS)",%eax\n"
- "\tmovl %eax,%ds\n"
- "\tmovl %eax,%es\n"
- "\tmovl %eax,%ss\n"
- "\tmovl %eax,%fs\n"
- "\tmovl %eax,%gs\n"
- );
-#undef STR
-#undef __STR
-}
-
-typedef void (*relocate_new_kernel_t)(
- unsigned long indirection_page, unsigned long control_code_buffer,
- unsigned long start_address, unsigned long pgtable);
-
-const extern unsigned char relocate_new_kernel[];
-extern void relocate_new_kernel_end(void);
-const extern unsigned long relocate_new_kernel_size;
-
-int machine_kexec_prepare(struct kimage *image)
-{
- unsigned long start_pgtable, control_code_buffer;
- int result;
-
- /* Calculate the offsets */
- start_pgtable = page_to_pfn(image->control_code_page) << PAGE_SHIFT;
- control_code_buffer = start_pgtable + 4096UL;
-
- /* Setup the identity mapped 64bit page table */
- result = init_pgtable(image, start_pgtable);
- if (result) {
- return result;
- }
-
- /* Place the code in the reboot code buffer */
- memcpy(__va(control_code_buffer), relocate_new_kernel, relocate_new_kernel_size);
-
- return 0;
-}
-
-void machine_kexec_cleanup(struct kimage *image)
-{
- return;
-}
-
-/*
- * Do not allocate memory (or fail in any way) in machine_kexec().
- * We are past the point of no return, committed to rebooting now.
- */
-void machine_kexec(struct kimage *image)
-{
- unsigned long indirection_page;
- unsigned long control_code_buffer;
- unsigned long start_pgtable;
- relocate_new_kernel_t rnk;
-
- /* Interrupts aren't acceptable while we reboot */
- local_irq_disable();
-
- /* Calculate the offsets */
- indirection_page = image->head & PAGE_MASK;
- start_pgtable = page_to_pfn(image->control_code_page) << PAGE_SHIFT;
- control_code_buffer = start_pgtable + 4096UL;
-
- /* Set the low half of the page table to my identity mapped
- * page table for kexec. Leave the high half pointing at the
- * kernel pages. Don't bother to flush the global pages
- * as that will happen when I fully switch to my identity mapped
- * page table anyway.
- */
-// memcpy(current->active_mm->pml4, __va(start_pgtable), PAGE_SIZE/2);
- __flush_tlb();
-
-
- /* The segment registers are funny things, they are
- * automatically loaded from a table, in memory wherever you
- * set them to a specific selector, but this table is never
- * accessed again unless you set the segment to a different selector.
- *
- * The more common model is a cache, where the
- * behind-the-scenes work is done but is also dropped at
- * arbitrary times.
- *
- * I take advantage of this here by force loading the
- * segments, before I zap the gdt with an invalid value.
- */
- load_segments();
- /* The gdt & idt are now invalid.
- * If you want to load them you must set up your own idt & gdt.
- */
- set_gdt(phys_to_virt(0),0);
- set_idt(phys_to_virt(0),0);
- /* now call it */
- rnk = (relocate_new_kernel_t) control_code_buffer;
- (*rnk)(indirection_page, control_code_buffer, image->start, start_pgtable);
-}
[target] "b" (WARMBOOT_TRAMP));
}
-static inline void kb_wait(void)
-{
- int i;
-
- for (i=0; i<0x10000; i++)
- if ((inb_p(0x64) & 0x02) == 0)
- break;
-}
-
-void machine_shutdown(void)
-{
- /* Stop the cpus and apics */
#ifdef CONFIG_SMP
- int reboot_cpu_id;
-
- /* The boot cpu is always logical cpu 0 */
- reboot_cpu_id = 0;
-
- /* Make certain the cpu I'm about to reboot on is online */
- if (!cpu_isset(reboot_cpu_id, cpu_online_map)) {
- reboot_cpu_id = smp_processor_id();
+static void smp_halt(void)
+{
+ int cpuid = safe_smp_processor_id();
+ static int first_entry = 1;
+
+ if (first_entry) {
+ first_entry = 0;
+ smp_call_function((void *)machine_restart, NULL, 1, 0);
+ }
+
+ smp_stop_cpu();
+
+ /* AP calling this. Just halt */
+ if (cpuid != boot_cpu_id) {
+ for (;;)
+ asm("hlt");
}
- /* Make certain I only run on the appropriate processor */
- set_cpus_allowed(current, cpumask_of_cpu(reboot_cpu_id));
-
- /* O.K Now that I'm on the appropriate processor,
- * stop all of the others.
- */
- smp_send_stop();
-#endif
-
- local_irq_disable();
-
-#ifndef CONFIG_SMP
- disable_local_APIC();
+ /* Wait for all other CPUs to have run smp_stop_cpu */
+ while (!cpus_empty(cpu_online_map))
+ rep_nop();
+}
#endif
- disable_IO_APIC();
+static inline void kb_wait(void)
+{
+ int i;
- local_irq_enable();
+ for (i=0; i<0x10000; i++)
+ if ((inb_p(0x64) & 0x02) == 0)
+ break;
}
void machine_restart(char * __unused)
{
int i;
- machine_shutdown();
+#ifdef CONFIG_SMP
+ smp_halt();
+#endif
local_irq_disable();
+++ /dev/null
-/*
- * relocate_kernel.S - put the kernel image in place to boot
- * Copyright (C) 2002-2004 Eric Biederman <ebiederm@xmission.com>
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#include <linux/linkage.h>
-
- /*
- * Must be relocatable PIC code callable as a C function that, once
- * it starts, cannot use the previous process's stack.
- */
- .globl relocate_new_kernel
- .code64
-relocate_new_kernel:
- /* %rdi indirection_page
- * %rsi reboot_code_buffer
- * %rdx start address
- * %rcx page_table
- * %r8 arg5
- * %r9 arg6
- */
-
- /* zero out flags, and disable interrupts */
- pushq $0
- popfq
-
- /* set a new stack at the bottom of our page... */
- lea 4096(%rsi), %rsp
-
- /* store the parameters back on the stack */
- pushq %rdx /* store the start address */
-
- /* Set cr0 to a known state:
- * 31 1 == Paging enabled
- * 18 0 == Alignment check disabled
- * 16 0 == Write protect disabled
- * 3 0 == No task switch
- * 2 0 == Don't do FP software emulation.
- * 0 1 == Protected mode enabled
- */
- movq %cr0, %rax
- andq $~((1<<18)|(1<<16)|(1<<3)|(1<<2)), %rax
- orl $((1<<31)|(1<<0)), %eax
- movq %rax, %cr0
-
- /* Set cr4 to a known state:
- * 10 0 == xmm exceptions disabled
- * 9 0 == xmm registers instructions disabled
- * 8 0 == performance monitoring counter disabled
- * 7 0 == page global disabled
- * 6 0 == machine check exceptions disabled
- * 5 1 == physical address extension enabled
- * 4 0 == page size extensions disabled
- * 3 0 == Debug extensions disabled
- * 2 0 == Time stamp disable (disabled)
- * 1 0 == Protected mode virtual interrupts disabled
- * 0 0 == VME disabled
- */
-
- movq $((1<<5)), %rax
- movq %rax, %cr4
-
- jmp 1f
-1:
-
- /* Switch to the identity mapped page tables,
- * and flush the TLB.
- */
- movq %rcx, %cr3
-
- /* Do the copies */
- movq %rdi, %rbx /* Put the indirection page in %rbx */
- xorq %rdi, %rdi
- xorq %rsi, %rsi
-
-0: /* top, read another word for the indirection page */
-
- movq (%rbx), %rcx
- addq $8, %rbx
- testq $0x1, %rcx /* is it a destination page? */
- jz 1f
- movq %rcx, %rdi
- andq $0xfffffffffffff000, %rdi
- jmp 0b
-1:
- testq $0x2, %rcx /* is it an indirection page? */
- jz 1f
- movq %rcx, %rbx
- andq $0xfffffffffffff000, %rbx
- jmp 0b
-1:
- testq $0x4, %rcx /* is it the done indicator? */
- jz 1f
- jmp 2f
-1:
- testq $0x8, %rcx /* is it the source indicator? */
- jz 0b /* Ignore it otherwise */
- movq %rcx, %rsi /* For every source page do a copy */
- andq $0xfffffffffffff000, %rsi
-
- movq $512, %rcx
- rep ; movsq
- jmp 0b
-2:
-
- /* To be certain of avoiding problems with self-modifying code
- * I need to execute a serializing instruction here.
- * So I flush the TLB by reloading %cr3 here, it's handy,
- * and not processor dependent.
- */
- movq %cr3, %rax
- movq %rax, %cr3
-
- /* set all of the registers to known values */
- /* leave %rsp alone */
-
- xorq %rax, %rax
- xorq %rbx, %rbx
- xorq %rcx, %rcx
- xorq %rdx, %rdx
- xorq %rsi, %rsi
- xorq %rdi, %rdi
- xorq %rbp, %rbp
- xorq %r8, %r8
- xorq %r9, %r9
- xorq %r10, %r10
- xorq %r11, %r11
- xorq %r12, %r12
- xorq %r13, %r13
- xorq %r14, %r14
- xorq %r15, %r15
-
- ret
-relocate_new_kernel_end:
-
- .globl relocate_new_kernel_size
-relocate_new_kernel_size:
- .quad relocate_new_kernel_end - relocate_new_kernel
#
# Automatically generated make config: don't edit
-# Linux kernel version: 2.6.10-1.14_FC2.1.planetlab.2005.04.14
-# Sat May 7 01:45:01 2005
+# Linux kernel version: 2.6.10-1.14_FC2.1.planetlab
+# Wed Mar 2 15:48:12 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_RCFS_FS=y
CONFIG_CKRM_TYPE_TASKCLASS=y
CONFIG_CKRM_RES_NULL=m
-# CONFIG_CKRM_RES_MEM is not set
-# CONFIG_CKRM_TYPE_SOCKETCLASS is not set
CONFIG_CKRM_RES_NUMTASKS=y
-# CONFIG_CKRM_RES_NUMTASKS_FORKRATE is not set
CONFIG_CKRM_CPU_SCHEDULE=y
# CONFIG_CKRM_RES_BLKIO is not set
+# CONFIG_CKRM_RES_MEM is not set
CONFIG_CKRM_CPU_SCHEDULE_AT_BOOT=y
+# CONFIG_CKRM_TYPE_SOCKETCLASS is not set
CONFIG_CKRM_RBCE=y
-# CONFIG_CKRM_CRBCE is not set
CONFIG_SYSCTL=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_REGPARM=y
-CONFIG_KERN_PHYS_OFFSET=1
-CONFIG_KEXEC=y
-# CONFIG_CRASH_DUMP is not set
#
# Power management options (ACPI, APM)
CONFIG_MD_RAID6=m
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
-CONFIG_BLK_DEV_DM=y
+CONFIG_BLK_DEV_DM=m
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_MIRROR=m
CONFIG_INET_TUNNEL=m
# CONFIG_ACCEPT_QUEUES is not set
CONFIG_IP_TCPDIAG=m
-# CONFIG_IP_TCPDIAG_IPV6 is not set
+CONFIG_IP_TCPDIAG_IPV6=y
#
# IP: Virtual Server Configuration
#
CONFIG_IP_VS_FTP=m
CONFIG_ICMP_IPOD=y
-# CONFIG_IPV6 is not set
+CONFIG_IPV6=m
+CONFIG_IPV6_PRIVACY=y
+CONFIG_INET6_AH=m
+CONFIG_INET6_ESP=m
+CONFIG_INET6_IPCOMP=m
+CONFIG_INET6_TUNNEL=m
+CONFIG_IPV6_TUNNEL=m
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_BRIDGE_NETFILTER=y
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
# CONFIG_IP_NF_CT_PROTO_GRE is not set
+#
+# IPv6: Netfilter Configuration
+#
+# CONFIG_IP6_NF_QUEUE is not set
+CONFIG_IP6_NF_IPTABLES=m
+CONFIG_IP6_NF_MATCH_LIMIT=m
+CONFIG_IP6_NF_MATCH_MAC=m
+CONFIG_IP6_NF_MATCH_RT=m
+CONFIG_IP6_NF_MATCH_OPTS=m
+CONFIG_IP6_NF_MATCH_FRAG=m
+CONFIG_IP6_NF_MATCH_HL=m
+CONFIG_IP6_NF_MATCH_MULTIPORT=m
+CONFIG_IP6_NF_MATCH_OWNER=m
+CONFIG_IP6_NF_MATCH_MARK=m
+CONFIG_IP6_NF_MATCH_IPV6HEADER=m
+CONFIG_IP6_NF_MATCH_AHESP=m
+CONFIG_IP6_NF_MATCH_LENGTH=m
+CONFIG_IP6_NF_MATCH_EUI64=m
+CONFIG_IP6_NF_MATCH_PHYSDEV=m
+CONFIG_IP6_NF_FILTER=m
+CONFIG_IP6_NF_TARGET_LOG=m
+CONFIG_IP6_NF_MANGLE=m
+CONFIG_IP6_NF_TARGET_MARK=m
+CONFIG_IP6_NF_RAW=m
+
#
# Bridge: Netfilter Configuration
#
CONFIG_BRIDGE=m
CONFIG_VLAN_8021Q=m
# CONFIG_DECNET is not set
-CONFIG_LLC=m
+CONFIG_LLC=y
# CONFIG_LLC2 is not set
CONFIG_IPX=m
# CONFIG_IPX_INTERN is not set
CONFIG_NETPOLL_TRAP=y
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_HAMRADIO is not set
-# CONFIG_IRDA is not set
-# CONFIG_BT is not set
-# CONFIG_TUX is not set
+CONFIG_IRDA=m
+
+#
+# IrDA protocols
+#
+CONFIG_IRLAN=m
+CONFIG_IRNET=m
+CONFIG_IRCOMM=m
+# CONFIG_IRDA_ULTRA is not set
+
+#
+# IrDA options
+#
+CONFIG_IRDA_CACHE_LAST_LSAP=y
+CONFIG_IRDA_FAST_RR=y
+# CONFIG_IRDA_DEBUG is not set
+
+#
+# Infrared-port device drivers
+#
+
+#
+# SIR device drivers
+#
+CONFIG_IRTTY_SIR=m
+
+#
+# Dongle support
+#
+CONFIG_DONGLE=y
+CONFIG_ESI_DONGLE=m
+CONFIG_ACTISYS_DONGLE=m
+CONFIG_TEKRAM_DONGLE=m
+CONFIG_LITELINK_DONGLE=m
+CONFIG_MA600_DONGLE=m
+CONFIG_GIRBIL_DONGLE=m
+CONFIG_MCP2120_DONGLE=m
+CONFIG_OLD_BELKIN_DONGLE=m
+CONFIG_ACT200L_DONGLE=m
+
+#
+# Old SIR device drivers
+#
+CONFIG_IRPORT_SIR=m
+
+#
+# Old Serial dongle support
+#
+# CONFIG_DONGLE_OLD is not set
+
+#
+# FIR device drivers
+#
+CONFIG_USB_IRDA=m
+CONFIG_SIGMATEL_FIR=m
+CONFIG_TOSHIBA_FIR=m
+CONFIG_VLSI_FIR=m
+CONFIG_BT=m
+CONFIG_BT_L2CAP=m
+CONFIG_BT_SCO=m
+CONFIG_BT_RFCOMM=m
+CONFIG_BT_RFCOMM_TTY=y
+CONFIG_BT_BNEP=m
+CONFIG_BT_BNEP_MC_FILTER=y
+CONFIG_BT_BNEP_PROTO_FILTER=y
+CONFIG_BT_CMTP=m
+CONFIG_BT_HIDP=m
+
+#
+# Bluetooth device drivers
+#
+CONFIG_BT_HCIUSB=m
+CONFIG_BT_HCIUSB_SCO=y
+CONFIG_BT_HCIUART=m
+CONFIG_BT_HCIUART_H4=y
+CONFIG_BT_HCIUART_BCSP=y
+CONFIG_BT_HCIUART_BCSP_TXCRC=y
+CONFIG_BT_HCIBCM203X=m
+CONFIG_BT_HCIBFUSB=m
+CONFIG_BT_HCIDTL1=m
+CONFIG_BT_HCIBT3C=m
+CONFIG_BT_HCIBLUECARD=m
+CONFIG_BT_HCIBTUART=m
+CONFIG_BT_HCIVHCI=m
+CONFIG_TUX=m
+
+#
+# TUX options
+#
+CONFIG_TUX_EXTCGI=y
+CONFIG_TUX_EXTENDED_LOG=y
+# CONFIG_TUX_DEBUG is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_BONDING=m
#
# Token Ring devices
#
-# CONFIG_TR is not set
+CONFIG_TR=y
+CONFIG_IBMOL=m
+CONFIG_IBMLS=m
+CONFIG_3C359=m
+CONFIG_TMS380TR=m
+CONFIG_TMSPCI=m
+CONFIG_ABYSS=m
#
# Wireless LAN (non-hamradio)
CONFIG_PCMCIA_SMC91C92=m
CONFIG_PCMCIA_XIRC2PS=m
CONFIG_PCMCIA_AXNET=m
+CONFIG_PCMCIA_IBMTR=m
#
# Wan interfaces
# CONFIG_DEFXX is not set
CONFIG_SKFP=m
# CONFIG_HIPPI is not set
-# CONFIG_PLIP is not set
-# CONFIG_PPP is not set
-# CONFIG_SLIP is not set
+CONFIG_PLIP=m
+CONFIG_PPP=m
+CONFIG_PPP_MULTILINK=y
+CONFIG_PPP_FILTER=y
+CONFIG_PPP_ASYNC=m
+CONFIG_PPP_SYNC_TTY=m
+CONFIG_PPP_DEFLATE=m
+# CONFIG_PPP_BSDCOMP is not set
+CONFIG_PPPOE=m
+CONFIG_PPPOATM=m
+CONFIG_SLIP=m
+CONFIG_SLIP_COMPRESSED=y
+CONFIG_SLIP_SMART=y
+# CONFIG_SLIP_MODE_SLIP6 is not set
CONFIG_NET_FC=y
# CONFIG_SHAPER is not set
CONFIG_NETCONSOLE=m
#
# Sound
#
-# CONFIG_SOUND is not set
+CONFIG_SOUND=m
+
+#
+# Advanced Linux Sound Architecture
+#
+CONFIG_SND=m
+CONFIG_SND_TIMER=m
+CONFIG_SND_PCM=m
+CONFIG_SND_HWDEP=m
+CONFIG_SND_RAWMIDI=m
+CONFIG_SND_SEQUENCER=m
+CONFIG_SND_SEQ_DUMMY=m
+CONFIG_SND_OSSEMUL=y
+CONFIG_SND_MIXER_OSS=m
+CONFIG_SND_PCM_OSS=m
+CONFIG_SND_SEQUENCER_OSS=y
+CONFIG_SND_RTCTIMER=m
+# CONFIG_SND_VERBOSE_PRINTK is not set
+# CONFIG_SND_DEBUG is not set
+
+#
+# Generic devices
+#
+CONFIG_SND_MPU401_UART=m
+CONFIG_SND_OPL3_LIB=m
+CONFIG_SND_VX_LIB=m
+CONFIG_SND_DUMMY=m
+CONFIG_SND_VIRMIDI=m
+CONFIG_SND_MTPAV=m
+# CONFIG_SND_SERIAL_U16550 is not set
+CONFIG_SND_MPU401=m
+
+#
+# PCI devices
+#
+CONFIG_SND_AC97_CODEC=m
+CONFIG_SND_ALI5451=m
+CONFIG_SND_ATIIXP=m
+CONFIG_SND_ATIIXP_MODEM=m
+CONFIG_SND_AU8810=m
+CONFIG_SND_AU8820=m
+CONFIG_SND_AU8830=m
+CONFIG_SND_AZT3328=m
+CONFIG_SND_BT87X=m
+# CONFIG_SND_BT87X_OVERCLOCK is not set
+CONFIG_SND_CS46XX=m
+CONFIG_SND_CS46XX_NEW_DSP=y
+CONFIG_SND_CS4281=m
+CONFIG_SND_EMU10K1=m
+CONFIG_SND_KORG1212=m
+CONFIG_SND_MIXART=m
+CONFIG_SND_NM256=m
+CONFIG_SND_RME32=m
+CONFIG_SND_RME96=m
+CONFIG_SND_RME9652=m
+CONFIG_SND_HDSP=m
+CONFIG_SND_TRIDENT=m
+CONFIG_SND_YMFPCI=m
+CONFIG_SND_ALS4000=m
+CONFIG_SND_CMIPCI=m
+CONFIG_SND_ENS1370=m
+CONFIG_SND_ENS1371=m
+CONFIG_SND_ES1938=m
+CONFIG_SND_ES1968=m
+CONFIG_SND_MAESTRO3=m
+CONFIG_SND_FM801=m
+CONFIG_SND_FM801_TEA575X=m
+CONFIG_SND_ICE1712=m
+CONFIG_SND_ICE1724=m
+CONFIG_SND_INTEL8X0=m
+CONFIG_SND_INTEL8X0M=m
+CONFIG_SND_SONICVIBES=m
+CONFIG_SND_VIA82XX=m
+CONFIG_SND_VX222=m
+
+#
+# USB devices
+#
+CONFIG_SND_USB_AUDIO=m
+CONFIG_SND_USB_USX2Y=m
+
+#
+# PCMCIA devices
+#
+
+#
+# Open Sound System
+#
+# CONFIG_SOUND_PRIME is not set
#
# USB support
#
# USB Device Class drivers
#
-# CONFIG_USB_BLUETOOTH_TTY is not set
+# CONFIG_USB_AUDIO is not set
+
+#
+# USB Bluetooth TTY can only be used with disabled Bluetooth subsystem
+#
+CONFIG_USB_MIDI=m
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m
#include <linux/devfs_fs_kernel.h>
#include <linux/ptrace.h>
#include <linux/device.h>
-#include <linux/highmem.h>
-#include <linux/crash_dump.h>
#include <asm/uaccess.h>
#include <asm/io.h>
return 0;
}
-#ifdef CONFIG_CRASH_DUMP
-/*
- * Read memory corresponding to the old kernel.
- * If we are reading from the reserved section, which is
- * actually used by the current kernel, we just return zeroes.
- * If we are reading from the first 640k, we read from the
- * backed-up area.
- */
-static ssize_t read_oldmem(struct file * file, char * buf,
- size_t count, loff_t *ppos)
-{
- unsigned long pfn;
- unsigned backup_start, backup_end, relocate_start;
- size_t read=0, csize;
-
- backup_start = CRASH_BACKUP_BASE / PAGE_SIZE;
- backup_end = backup_start + (CRASH_BACKUP_SIZE / PAGE_SIZE);
- relocate_start = (CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE) / PAGE_SIZE;
-
- while(count) {
- pfn = *ppos / PAGE_SIZE;
-
- csize = (count > PAGE_SIZE) ? PAGE_SIZE : count;
-
- /* Perform translation (see comment above) */
- if ((pfn >= backup_start) && (pfn < backup_end)) {
- if (clear_user(buf, csize)) {
- read = -EFAULT;
- goto done;
- }
-
- goto copy_done;
- } else if (pfn < (CRASH_RELOCATE_SIZE / PAGE_SIZE))
- pfn += relocate_start;
-
- if (pfn > saved_max_pfn) {
- read = 0;
- goto done;
- }
-
- if (copy_oldmem_page(pfn, buf, csize, 1)) {
- read = -EFAULT;
- goto done;
- }
-
-copy_done:
- buf += csize;
- *ppos += csize;
- read += csize;
- count -= csize;
- }
-done:
- return read;
-}
-#endif
-
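The translation performed by the removed read_oldmem() above (reads from the reserved backup region return zeroes; reads from the first 640k are redirected to the backed-up copy) can be sketched as plain pfn arithmetic in userspace C. The constants below are illustrative stand-ins for the CONFIG_BACKUP_BASE/CONFIG_BACKUP_SIZE build-time values, not the real ones:

```c
#include <assert.h>

/* Illustrative constants only; the real values come from
 * CONFIG_BACKUP_BASE/CONFIG_BACKUP_SIZE at compile time. */
#define PAGE_SIZE        4096UL
#define BACKUP_BASE_PFN  (16UL * 1024 * 1024 / PAGE_SIZE)   /* 16 MB mark */
#define BACKUP_PAGES     (64UL * 1024 * 1024 / PAGE_SIZE)   /* 64 MB region */
#define RELOCATE_PAGES   (0xa0000UL / PAGE_SIZE)            /* first 640k */

/* Returns the pfn to actually read, or -1 if the caller should
 * supply zeroes (the region is in use by the capture kernel). */
static long translate_oldmem_pfn(unsigned long pfn)
{
	unsigned long backup_start = BACKUP_BASE_PFN;
	unsigned long backup_end = backup_start + BACKUP_PAGES;
	unsigned long relocate_start = backup_end;

	if (pfn >= backup_start && pfn < backup_end)
		return -1;			/* reads as zeroes */
	if (pfn < RELOCATE_PAGES)
		return pfn + relocate_start;	/* redirected to backup copy */
	return pfn;				/* read in place */
}
```

The same three-way split (zero-fill, relocate, pass through) appears again in the read_vmcore() path removed further below.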
extern long vread(char *buf, char *addr, unsigned long count);
extern long vwrite(char *buf, char *addr, unsigned long count);
#define read_full read_zero
#define open_mem open_port
#define open_kmem open_mem
-#define open_oldmem open_mem
static struct file_operations mem_fops = {
.llseek = memory_lseek,
.write = write_full,
};
-#ifdef CONFIG_CRASH_DUMP
-static struct file_operations oldmem_fops = {
- .read = read_oldmem,
- .open = open_oldmem,
-};
-#endif
-
static ssize_t kmsg_write(struct file * file, const char __user * buf,
size_t count, loff_t *ppos)
{
case 11:
filp->f_op = &kmsg_fops;
break;
-#ifdef CONFIG_CRASH_DUMP
- case 12:
- filp->f_op = &oldmem_fops;
- break;
-#endif
default:
return -ENXIO;
}
{8, "random", S_IRUGO | S_IWUSR, &random_fops},
{9, "urandom", S_IRUGO | S_IWUSR, &urandom_fops},
{11,"kmsg", S_IRUGO | S_IWUSR, &kmsg_fops},
-#ifdef CONFIG_CRASH_DUMP
- {12,"oldmem", S_IRUSR | S_IWUSR | S_IRGRP, &oldmem_fops},
-#endif
};
static struct class_simple *mem_class;
buf += sizeof(struct __dump_page);
while (len) {
- addr = kmap_atomic(page, KM_CRASHDUMP);
+ addr = kmap_atomic(page, KM_DUMP);
size = bytes = (len > PAGE_SIZE) ? PAGE_SIZE : len;
/* check for compression */
if (dump_allow_compress(page, bytes)) {
size = bytes;
}
/* memset(buf, 'A', size); temporary: testing only !! */
- kunmap_atomic(addr, KM_CRASHDUMP);
+ kunmap_atomic(addr, KM_DUMP);
dp->dp_size += size;
buf += size;
len -= bytes;
free_dha_stack();
}
-extern int page_is_ram(unsigned long);
+extern int pfn_is_ram(unsigned long);
/*
* Name: __dump_page_valid()
if (!pfn_valid(index))
return 0;
- return page_is_ram(index);
+ return pfn_is_ram(index);
}
/*
pr_debug("indirect map[%d] = 0x%lx\n", i, map1[i]);
page = pfn_to_page(map1[i]);
set_page_count(page, 1);
- map2 = kmap_atomic(page, KM_CRASHDUMP);
+ map2 = kmap_atomic(page, KM_DUMP);
for (j = 0 ; (j < DUMP_MAP_SZ) && map2[j] &&
(off + j < last); j++) {
pr_debug("\t map[%d][%d] = 0x%lx\n", i, j,
}
if (page)
- map = kmap_atomic(page, KM_CRASHDUMP);
+ map = kmap_atomic(page, KM_DUMP);
else
return NULL;
} else {
page = NULL;
}
- kunmap_atomic(map, KM_CRASHDUMP);
+ kunmap_atomic(map, KM_DUMP);
return page;
}
};
if (*dev->curr_map) {
- map = kmap_atomic(pfn_to_page(*dev->curr_map), KM_CRASHDUMP);
+ map = kmap_atomic(pfn_to_page(*dev->curr_map), KM_DUMP);
if (map[i])
page = pfn_to_page(map[i]);
- kunmap_atomic(map, KM_CRASHDUMP);
+ kunmap_atomic(map, KM_DUMP);
dev->ddev.curr_offset += PAGE_SIZE;
};
/* add data space */
i = dev->curr_map_offset;
map_page = pfn_to_page(*dev->curr_map);
- map = (unsigned long *)kmap_atomic(map_page, KM_CRASHDUMP);
+ map = (unsigned long *)kmap_atomic(map_page, KM_DUMP);
map[i] = page_to_pfn(page);
- kunmap_atomic(map, KM_CRASHDUMP);
+ kunmap_atomic(map, KM_DUMP);
dev->curr_map_offset = ++i;
dev->last_offset += PAGE_SIZE;
if (i >= DUMP_MAP_SZ) {
page = dump_mem_lookup(dump_mdev, dev->curr_offset >> PAGE_SHIFT);
for (n = len; (n > 0) && page; n -= PAGE_SIZE, buf += PAGE_SIZE ) {
- addr = kmap_atomic(page, KM_CRASHDUMP);
+ addr = kmap_atomic(page, KM_DUMP);
/* memset(addr, 'x', PAGE_SIZE); */
memcpy(addr, buf, PAGE_SIZE);
- kunmap_atomic(addr, KM_CRASHDUMP);
+ kunmap_atomic(addr, KM_DUMP);
/* dev->curr_offset += PAGE_SIZE; */
page = dump_mem_next_page(dump_mdev);
}
else
count++;
/* clear the contents of page */
- /* fixme: consider using KM_CRASHDUMP instead */
+ /* fixme: consider using KM_DUMP instead */
clear_highpage(page);
}
void *addr;
while (len < sz) {
- addr = kmap_atomic(page, KM_CRASHDUMP);
+ addr = kmap_atomic(page, KM_DUMP);
bytes = (sz > len + PAGE_SIZE) ? PAGE_SIZE : sz - len;
memcpy(buf, addr, bytes);
- kunmap_atomic(addr, KM_CRASHDUMP);
+ kunmap_atomic(addr, KM_DUMP);
buf += bytes;
len += bytes;
page++;
dump_sysrq_register(void)
{
#ifdef CONFIG_MAGIC_SYSRQ
- register_sysrq_key(DUMP_SYSRQ_KEY, &sysrq_crashdump_op);
+ __sysrq_lock_table();
+ __sysrq_put_key_op(DUMP_SYSRQ_KEY, &sysrq_crashdump_op);
+ __sysrq_unlock_table();
#endif
}
dump_sysrq_unregister(void)
{
#ifdef CONFIG_MAGIC_SYSRQ
- unregister_sysrq_key(DUMP_SYSRQ_KEY, &sysrq_crashdump_op);
+ __sysrq_lock_table();
+ if (__sysrq_get_key_op(DUMP_SYSRQ_KEY) == &sysrq_crashdump_op)
+ __sysrq_put_key_op(DUMP_SYSRQ_KEY, NULL);
+ __sysrq_unlock_table();
#endif
}
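The unregister hunk above replaces the helper with an explicit check that the dump code still owns the sysrq slot before clearing it. A minimal userspace sketch of that compare-before-clear pattern (the key table and ops struct here are toy stand-ins; the kernel additionally holds __sysrq_lock_table() around both calls):

```c
#include <assert.h>
#include <stddef.h>

struct sysrq_key_op { const char *help_msg; };

/* Toy 16-slot table standing in for the kernel's sysrq key table. */
static struct sysrq_key_op *key_table[16];

static struct sysrq_key_op *get_key_op(int key) { return key_table[key]; }
static void put_key_op(int key, struct sysrq_key_op *op) { key_table[key] = op; }

/* Unregister only if we still own the slot, mirroring the patched
 * dump_sysrq_unregister(): another subsystem may have replaced it. */
static void safe_unregister(int key, struct sysrq_key_op *mine)
{
	if (get_key_op(key) == mine)
		put_key_op(key, NULL);
}
```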
* (Note: this routine is intended to be called only
* from a kernel thread context)
*/
-void use_mm(struct mm_struct *mm)
+static void use_mm(struct mm_struct *mm)
{
struct mm_struct *active_mm;
struct task_struct *tsk = current;
error = vx_proc_ioctl(filp->f_dentry->d_inode, filp, cmd, arg);
break;
#endif
- case FIOC_SETIATTR:
- case FIOC_GETIATTR:
- /*
- * Verify that this filp is a file object,
- * not (say) a socket.
- */
- error = -ENOTTY;
- if (S_ISREG(filp->f_dentry->d_inode->i_mode) ||
- S_ISDIR(filp->f_dentry->d_inode->i_mode))
- error = vc_iattr_ioctl(filp->f_dentry,
- cmd, arg);
- break;
-
default:
error = -ENOTTY;
if (S_ISREG(filp->f_dentry->d_inode->i_mode))
const struct posix_acl_entry *pa, *pe, *mask_obj;
int found = 0;
- /* Prevent vservers from escaping chroot() barriers */
- if (IS_BARRIER(inode) && !vx_check(0, VX_ADMIN))
- return -EACCES;
-
FOREACH_ACL_ENTRY(pa, acl, pe) {
switch(pa->e_tag) {
case ACL_USER_OBJ:
kmsg.o proc_tty.o proc_misc.o
proc-$(CONFIG_PROC_KCORE) += kcore.o
-proc-$(CONFIG_CRASH_DUMP) += vmcore.o
proc-$(CONFIG_PROC_DEVICETREE) += proc_devtree.o
/*
* determine size of ELF note
*/
-int notesize(struct memelfnote *en)
+static int notesize(struct memelfnote *en)
{
int sz;
/*
* store a note in the header buffer
*/
-char *storenote(struct memelfnote *men, char *bufp)
+static char *storenote(struct memelfnote *men, char *bufp)
{
struct elf_note en;
* store an ELF coredump header in the supplied buffer
* nphdr is the number of elf_phdr to insert
*/
-void elf_kcore_store_hdr(char *bufp, int nphdr, int dataoff, struct kcore_list *clist)
+static void elf_kcore_store_hdr(char *bufp, int nphdr, int dataoff)
{
struct elf_prstatus prstatus; /* NT_PRSTATUS */
struct elf_prpsinfo prpsinfo; /* NT_PRPSINFO */
nhdr->p_align = 0;
/* setup ELF PT_LOAD program header for every area */
- for (m=clist; m; m=m->next) {
+ for (m=kclist; m; m=m->next) {
phdr = (struct elf_phdr *) bufp;
bufp += sizeof(struct elf_phdr);
offset += sizeof(struct elf_phdr);
return -ENOMEM;
}
memset(elf_buf, 0, elf_buflen);
- elf_kcore_store_hdr(elf_buf, nphdr, elf_buflen, kclist);
+ elf_kcore_store_hdr(elf_buf, nphdr, elf_buflen);
read_unlock(&kclist_lock);
if (copy_to_user(buffer, elf_buf + *fpos, tsz)) {
kfree(elf_buf);
#include <linux/jiffies.h>
#include <linux/sysrq.h>
#include <linux/vmalloc.h>
-#include <linux/crash_dump.h>
#include <linux/vs_base.h>
#include <linux/vs_cvirt.h>
(size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
}
#endif
- crash_create_proc_entry();
#ifdef CONFIG_MAGIC_SYSRQ
entry = create_proc_entry("sysrq-trigger", S_IWUSR, NULL);
if (entry)
entry->proc_fops = &proc_sysrq_trigger_operations;
#endif
- crash_enable_by_proc();
#ifdef CONFIG_PPC32
{
extern struct file_operations ppc_htab_operations;
+++ /dev/null
-/*
- * fs/proc/vmcore.c Interface for accessing the crash
- * dump from the system's previous life.
- * Heavily borrowed from fs/proc/kcore.c
- * Created by: Hariprasad Nellitheertha (hari@in.ibm.com)
- * Copyright (C) IBM Corporation, 2004. All rights reserved
- */
-
-#include <linux/config.h>
-#include <linux/mm.h>
-#include <linux/proc_fs.h>
-#include <linux/user.h>
-#include <linux/a.out.h>
-#include <linux/elf.h>
-#include <linux/elfcore.h>
-#include <linux/vmalloc.h>
-#include <linux/proc_fs.h>
-#include <linux/highmem.h>
-#include <linux/bootmem.h>
-#include <linux/init.h>
-#include <linux/crash_dump.h>
-#include <asm/uaccess.h>
-#include <asm/io.h>
-
-/* This is to re-use the kcore header creation code */
-static struct kcore_list vmcore_mem;
-
-static int open_vmcore(struct inode * inode, struct file * filp)
-{
- return 0;
-}
-
-static ssize_t read_vmcore(struct file *,char __user *,size_t, loff_t *);
-
-#define BACKUP_START CRASH_BACKUP_BASE
-#define BACKUP_END CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE
-#define REG_SIZE sizeof(elf_gregset_t)
-
-struct file_operations proc_vmcore_operations = {
- .read = read_vmcore,
- .open = open_vmcore,
-};
-
-struct proc_dir_entry *proc_vmcore;
-
-struct memelfnote
-{
- const char *name;
- int type;
- unsigned int datasz;
- void *data;
-};
-
-static size_t get_vmcore_size(int *nphdr, size_t *elf_buflen)
-{
- size_t size;
-
- /* We need 1 PT_LOAD segment headers
- * In addition, we need one PT_NOTE header
- */
- *nphdr = 2;
- size = (size_t)(saved_max_pfn << PAGE_SHIFT);
-
- *elf_buflen = sizeof(struct elfhdr) +
- (*nphdr + 2)*sizeof(struct elf_phdr) +
- 3 * sizeof(struct memelfnote) +
- sizeof(struct elf_prstatus) +
- sizeof(struct elf_prpsinfo) +
- sizeof(struct task_struct);
- *elf_buflen = PAGE_ALIGN(*elf_buflen);
- return size + *elf_buflen;
-}
-
-/*
- * Reads a page from the oldmem device from given offset.
- */
-static ssize_t read_from_oldmem(char *buf, size_t count,
- loff_t *ppos, int userbuf)
-{
- unsigned long pfn;
- size_t read = 0;
-
- pfn = (unsigned long)(*ppos / PAGE_SIZE);
-
- if (pfn > saved_max_pfn) {
- read = -EINVAL;
- goto done;
- }
-
- count = (count > PAGE_SIZE) ? PAGE_SIZE : count;
-
- if (copy_oldmem_page(pfn, buf, count, userbuf)) {
- read = -EFAULT;
- goto done;
- }
-
- *ppos += count;
-done:
- return read;
-}
-
-/*
- * store an ELF crash dump header in the supplied buffer
- * nphdr is the number of elf_phdr to insert
- */
-static void elf_vmcore_store_hdr(char *bufp, int nphdr, int dataoff)
-{
- struct elf_prstatus prstatus; /* NT_PRSTATUS */
- struct memelfnote notes[1];
- char reg_buf[REG_SIZE];
- loff_t reg_ppos;
- char *buf = bufp;
-
- vmcore_mem.addr = (unsigned long)__va(0);
- vmcore_mem.size = saved_max_pfn << PAGE_SHIFT;
- vmcore_mem.next = NULL;
-
- /* Re-use the kcore code */
- elf_kcore_store_hdr(bufp, nphdr, dataoff, &vmcore_mem);
- buf += sizeof(struct elfhdr) + 2*sizeof(struct elf_phdr);
-
- /* set up the process status */
- notes[0].name = "CORE";
- notes[0].type = NT_PRSTATUS;
- notes[0].datasz = sizeof(struct elf_prstatus);
- notes[0].data = &prstatus;
-
- memset(&prstatus, 0, sizeof(struct elf_prstatus));
-
- /* 1 - Get the registers from the reserved memory area */
- reg_ppos = BACKUP_END + CRASH_RELOCATE_SIZE;
-	read_from_oldmem(reg_buf, REG_SIZE, &reg_ppos, 0);
- elf_core_copy_regs(&prstatus.pr_reg, (struct pt_regs *)reg_buf);
-	buf = storenote(&notes[0], buf);
-}
-
-/*
- * read from the ELF header and then the crash dump
- */
-static ssize_t read_vmcore(
-struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
-{
- ssize_t acc = 0;
- size_t size, tsz;
- size_t elf_buflen;
- int nphdr;
- unsigned long start;
-
- tsz = get_vmcore_size(&nphdr, &elf_buflen);
- proc_vmcore->size = size = tsz + elf_buflen;
- if (buflen == 0 || *fpos >= size) {
- goto done;
- }
-
- /* trim buflen to not go beyond EOF */
- if (buflen > size - *fpos)
- buflen = size - *fpos;
-
- /* construct an ELF core header if we'll need some of it */
- if (*fpos < elf_buflen) {
- char * elf_buf;
-
- tsz = elf_buflen - *fpos;
- if (buflen < tsz)
- tsz = buflen;
- elf_buf = kmalloc(elf_buflen, GFP_ATOMIC);
- if (!elf_buf) {
- acc = -ENOMEM;
- goto done;
- }
- memset(elf_buf, 0, elf_buflen);
- elf_vmcore_store_hdr(elf_buf, nphdr, elf_buflen);
- if (copy_to_user(buffer, elf_buf + *fpos, tsz)) {
- kfree(elf_buf);
- acc = -EFAULT;
- goto done;
- }
- kfree(elf_buf);
- buflen -= tsz;
- *fpos += tsz;
- buffer += tsz;
- acc += tsz;
-
- /* leave now if filled buffer already */
- if (buflen == 0) {
- goto done;
- }
- }
-
- start = *fpos - elf_buflen;
- if ((tsz = (PAGE_SIZE - (start & ~PAGE_MASK))) > buflen)
- tsz = buflen;
-
- while (buflen) {
- unsigned long p;
- loff_t pdup;
-
- if ((start < 0) || (start >= size))
- if (clear_user(buffer, tsz)) {
- acc = -EFAULT;
- goto done;
- }
-
- /* tsz contains actual len of dump to be read.
- * buflen is the total len that was requested.
- * This may contain part of ELF header. start
- * is the fpos for the oldmem region
- * If the file position corresponds to the second
- * kernel's memory, we just return zeroes
- */
- p = start;
- if ((p >= BACKUP_START) && (p < BACKUP_END)) {
- if (clear_user(buffer, tsz)) {
- acc = -EFAULT;
- goto done;
- }
-
- goto read_done;
- } else if (p < CRASH_RELOCATE_SIZE)
- p += BACKUP_END;
-
- pdup = p;
- if (read_from_oldmem(buffer, tsz, &pdup, 1)) {
- acc = -EINVAL;
- goto done;
- }
-
-read_done:
- buflen -= tsz;
- *fpos += tsz;
- buffer += tsz;
- acc += tsz;
- start += tsz;
- tsz = (buflen > PAGE_SIZE ? PAGE_SIZE : buflen);
- }
-
-done:
- return acc;
-}
}
#define SECURITY_INIT \
- .security_initcall.init : AT(ADDR(.security_initcall.init) - LOAD_OFFSET) {\
+ .security_initcall.init : { \
VMLINUX_SYMBOL(__security_initcall_start) = .; \
*(.security_initcall.init) \
VMLINUX_SYMBOL(__security_initcall_end) = .; \
#define APIC_LVT_REMOTE_IRR (1<<14)
#define APIC_INPUT_POLARITY (1<<13)
#define APIC_SEND_PENDING (1<<12)
-#define APIC_MODE_MASK 0x700
#define GET_APIC_DELIVERY_MODE(x) (((x)>>8)&0x7)
#define SET_APIC_DELIVERY_MODE(x,y) (((x)&~0x700)|((y)<<8))
#define APIC_MODE_FIXED 0x0
+++ /dev/null
-/* asm-i386/crash_dump.h */
-#include <linux/bootmem.h>
-#include <linux/irq.h>
-#include <asm/apic.h>
-
-#ifdef CONFIG_CRASH_DUMP
-extern unsigned int dump_enabled;
-extern unsigned int crashed;
-
-extern void __crash_relocate_mem(unsigned long, unsigned long);
-extern unsigned long __init find_max_low_pfn(void);
-extern void __init find_max_pfn(void);
-
-extern struct pt_regs crash_smp_regs[NR_CPUS];
-extern long crash_smp_current_task[NR_CPUS];
-extern void crash_dump_save_this_cpu(struct pt_regs *, int);
-extern void __crash_dump_stop_cpus(void);
-extern void crash_get_current_regs(struct pt_regs *regs);
-
-#define CRASH_BACKUP_BASE ((unsigned long)CONFIG_BACKUP_BASE * 0x100000)
-#define CRASH_BACKUP_SIZE ((unsigned long)CONFIG_BACKUP_SIZE * 0x100000)
-#define CRASH_RELOCATE_SIZE 0xa0000
-
-static inline void crash_relocate_mem(void)
-{
- if (crashed)
- __crash_relocate_mem(CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE,
- CRASH_RELOCATE_SIZE);
-}
-
-static inline void set_saved_max_pfn(void)
-{
- find_max_pfn();
- saved_max_pfn = find_max_low_pfn();
-}
-
-static inline void crash_reserve_bootmem(void)
-{
- if (!dump_enabled) {
- reserve_bootmem(CRASH_BACKUP_BASE,
- CRASH_BACKUP_SIZE + CRASH_RELOCATE_SIZE + PAGE_SIZE);
- }
-}
-
-static inline void crash_dump_stop_cpus(void)
-{
- int cpu;
-
- if (!crashed)
- return;
-
- cpu = smp_processor_id();
-
- crash_smp_current_task[cpu] = (long)current;
- crash_get_current_regs(&crash_smp_regs[cpu]);
-
- /* This also captures the register states of the other cpus */
- __crash_dump_stop_cpus();
-#if defined(CONFIG_X86_IO_APIC)
- disable_IO_APIC();
-#endif
-#if defined(CONFIG_X86_LOCAL_APIC)
- disconnect_bsp_APIC();
-#endif
-}
-
-static inline void crash_dump_save_registers(void)
-{
- void *addr;
-
- addr = __va(CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE + CRASH_RELOCATE_SIZE);
- memcpy(addr, crash_smp_regs, (sizeof(struct pt_regs)*NR_CPUS));
- addr += sizeof(struct pt_regs)*NR_CPUS;
- memcpy(addr, crash_smp_current_task, (sizeof(long)*NR_CPUS));
-}
-#else
-#define crash_relocate_mem() do { } while(0)
-#define set_saved_max_pfn() do { } while(0)
-#define crash_reserve_bootmem() do { } while(0)
-#define crash_dump_stop_cpus() do { } while(0)
-#define crash_dump_save_registers() do { } while(0)
-#endif
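The reserved-region layout implied by the macros in the removed header is simple offset arithmetic: the backup copy of the first 640k lives immediately after the capture kernel's memory, and the saved per-cpu registers follow that, as crash_dump_save_registers() assumed. A sketch, with illustrative sizes in place of the CONFIG_BACKUP_* values:

```c
#include <assert.h>

#define MB(x) ((unsigned long)(x) * 0x100000)

/* Illustrative stand-ins for CONFIG_BACKUP_BASE/CONFIG_BACKUP_SIZE. */
#define CRASH_BACKUP_BASE   MB(16)
#define CRASH_BACKUP_SIZE   MB(64)
#define CRASH_RELOCATE_SIZE 0xa0000UL	/* first 640k, backed up at boot */

/* Start of the backup copy of low memory. */
static unsigned long backup_copy_start(void)
{
	return CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE;
}

/* Start of the saved register block written after the backup copy. */
static unsigned long saved_regs_start(void)
{
	return backup_copy_start() + CRASH_RELOCATE_SIZE;
}
```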
void kunmap(struct page *page);
void *kmap_atomic(struct page *page, enum km_type type);
void kunmap_atomic(void *kvaddr, enum km_type type);
-char *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
struct page *kmap_atomic_to_page(void *ptr);
#define flush_cache_kmaps() do { } while (0)
+++ /dev/null
-#ifndef _I386_KEXEC_H
-#define _I386_KEXEC_H
-
-#include <asm/fixmap.h>
-
-/*
- * KEXEC_SOURCE_MEMORY_LIMIT maximum page get_free_page can return.
- * I.e. Maximum page that is mapped directly into kernel memory,
- * and kmap is not required.
- *
- * Someone correct me if FIXADDR_START - PAGEOFFSET is not the correct
- * calculation for the amount of memory directly mappable into the
- * kernel memory space.
- */
-
-/* Maximum physical address we can use pages from */
-#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
-/* Maximum address we can reach in physical address mode */
-#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
-/* Maximum address we can use for the control code buffer */
-#define KEXEC_CONTROL_MEMORY_LIMIT TASK_SIZE
-
-#define KEXEC_CONTROL_CODE_SIZE 4096
-
-#endif /* _I386_KEXEC_H */
#define INVALIDATE_TLB_VECTOR 0xfd
#define RESCHEDULE_VECTOR 0xfc
#define CALL_FUNCTION_VECTOR 0xfb
-#define CRASH_DUMP_VECTOR 0xfa
+#define DUMP_VECTOR 0xfa
#define THERMAL_APIC_VECTOR 0xf0
/*
extern void smp_invalidate_rcv(void); /* Process an NMI */
extern void (*mtrr_hook) (void);
extern void zap_low_mappings (void);
-extern void stop_this_cpu(void *);
#define MAX_APICID 256
extern u8 x86_cpu_to_apicid[];
+++ /dev/null
-#ifndef _PPC_KEXEC_H
-#define _PPC_KEXEC_H
-
-#ifdef CONFIG_KEXEC
-
-/*
- * KEXEC_SOURCE_MEMORY_LIMIT maximum page get_free_page can return.
- * I.e. Maximum page that is mapped directly into kernel memory,
- * and kmap is not required.
- *
- * Someone correct me if FIXADDR_START - PAGEOFFSET is not the correct
- * calculation for the amount of memory directly mappable into the
- * kernel memory space.
- */
-
-/* Maximum physical address we can use pages from */
-#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
-/* Maximum address we can reach in physical address mode */
-#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
-/* Maximum address we can use for the control code buffer */
-#define KEXEC_CONTROL_MEMORY_LIMIT TASK_SIZE
-
-#define KEXEC_CONTROL_CODE_SIZE 4096
-
-
-#ifndef __ASSEMBLY__
-
-struct kimage;
-
-extern void machine_kexec_simple(struct kimage *image);
-
-#endif /* __ASSEMBLY__ */
-
-#endif /* CONFIG_KEXEC */
-
-#endif /* _PPC_KEXEC_H */
#include <linux/config.h>
#include <linux/init.h>
-#include <linux/kexec.h>
#include <asm/setup.h>
/* functions for dealing with other cpus */
struct smp_ops_t *smp_ops;
#endif /* CONFIG_SMP */
-
-#ifdef CONFIG_KEXEC
- /* Called to shutdown machine specific hardware not already controlled
- * by other drivers.
- * XXX Should we move this one out of kexec scope?
- */
- void (*machine_shutdown)(void);
-
- /* Called to do what every setup is needed on image and the
- * reboot code buffer. Returns 0 on success.
- * Provide your own (maybe dummy) implementation if your platform
- * claims to support kexec.
- */
- int (*machine_kexec_prepare)(struct kimage *image);
-
- /* Called to handle any machine specific cleanup on image */
- void (*machine_kexec_cleanup)(struct kimage *image);
-
- /* Called to perform the _real_ kexec.
- * Do NOT allocate memory or fail here. We are past the point of
- * no return.
- */
- void (*machine_kexec)(struct kimage *image);
-#endif /* CONFIG_KEXEC */
};
extern struct machdep_calls ppc_md;
+++ /dev/null
-#ifndef _X86_64_KEXEC_H
-#define _X86_64_KEXEC_H
-
-#include <asm/page.h>
-#include <asm/proto.h>
-
-/*
- * KEXEC_SOURCE_MEMORY_LIMIT maximum page get_free_page can return.
- * I.e. Maximum page that is mapped directly into kernel memory,
- * and kmap is not required.
- *
- * So far x86_64 is limited to 40 physical address bits.
- */
-
-/* Maximum physical address we can use pages from */
-#define KEXEC_SOURCE_MEMORY_LIMIT (0xFFFFFFFFFFUL)
-/* Maximum address we can reach in physical address mode */
-#define KEXEC_DESTINATION_MEMORY_LIMIT (0xFFFFFFFFFFUL)
-/* Maximum address we can use for the control pages */
-#define KEXEC_CONTROL_MEMORY_LIMIT (0xFFFFFFFFFFUL)
-
-/* Allocate one page for the pdp and the second for the code */
-#define KEXEC_CONTROL_CODE_SIZE (4096UL + 4096UL)
-
-#endif /* _X86_64_KEXEC_H */
#define __NR_mq_getsetattr 245
__SYSCALL(__NR_mq_getsetattr, sys_mq_getsetattr)
#define __NR_kexec_load 246
-__SYSCALL(__NR_kexec_load, sys_kexec_load)
+__SYSCALL(__NR_kexec_load, sys_ni_syscall)
#define __NR_waitid 247
__SYSCALL(__NR_waitid, sys_waitid)
#define __NR_syscall_max __NR_waitid
* highest page
*/
extern unsigned long max_pfn;
-extern unsigned long saved_max_pfn;
/*
* node_bootmem_map is a map pointer - the bits represent all physical
*
*/
+/* Changes
+ *
+ * 31 Mar 2004
+ * Created.
+ */
+
#ifndef _LINUX_CKRM_TSK_H
#define _LINUX_CKRM_TSK_H
#ifdef CONFIG_CKRM_TYPE_TASKCLASS
#include <linux/ckrm_rc.h>
-typedef int (*get_ref_t) (struct ckrm_core_class *, int);
-typedef void (*put_ref_t) (struct ckrm_core_class *);
+typedef int (*get_ref_t) (void *, int);
+typedef void (*put_ref_t) (void *);
-extern int numtasks_get_ref(struct ckrm_core_class *, int);
-extern void numtasks_put_ref(struct ckrm_core_class *);
+extern int numtasks_get_ref(void *, int);
+extern void numtasks_put_ref(void *);
extern void ckrm_numtasks_register(get_ref_t, put_ref_t);
#else /* CONFIG_CKRM_TYPE_TASKCLASS */
-#define numtasks_get_ref(core_class, ref) (1)
-#define numtasks_put_ref(core_class) do {} while (0)
+#define numtasks_get_ref(a, b) (1)
+#define numtasks_put_ref(a) do {} while(0)
#endif /* CONFIG_CKRM_TYPE_TASKCLASS */
#endif /* _LINUX_CKRM_RES_H */
+++ /dev/null
-#include <linux/kexec.h>
-#include <linux/smp_lock.h>
-#include <linux/device.h>
-#include <linux/proc_fs.h>
-#ifdef CONFIG_CRASH_DUMP
-#include <asm/crash_dump.h>
-#endif
-
-extern unsigned long saved_max_pfn;
-extern struct memelfnote memelfnote;
-extern int notesize(struct memelfnote *);
-extern char *storenote(struct memelfnote *, char *);
-extern void elf_kcore_store_hdr(char *, int, int, struct kcore_list *);
-
-#ifdef CONFIG_CRASH_DUMP
-extern ssize_t copy_oldmem_page(unsigned long, char *, size_t, int);
-extern void __crash_machine_kexec(void);
-extern int crash_dump_on;
-static inline void crash_machine_kexec(void)
-{
- __crash_machine_kexec();
-}
-#else
-#define crash_machine_kexec() do { } while(0)
-#endif
-
-
-#if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PROC_FS)
-extern void crash_enable_by_proc(void);
-extern void crash_create_proc_entry(void);
-#else
-#define crash_enable_by_proc() do { } while(0)
-#define crash_create_proc_entry() do { } while(0)
-#endif
#ifndef _DUMP_H
#define _DUMP_H
-#if defined(CONFIG_CRASH_DUMP)
+#if defined(CONFIG_CRASH_DUMP) || defined (CONFIG_CRASH_DUMP_MODULE)
#include <linux/list.h>
#include <linux/notifier.h>
#define EXT2_RESERVED_FL 0x80000000 /* reserved for ext2 lib */
#ifdef CONFIG_VSERVER_LEGACY
-#define EXT2_FL_USER_VISIBLE 0x0C03DFFF /* User visible flags */
-#define EXT2_FL_USER_MODIFIABLE 0x0C0380FF /* User modifiable flags */
+#define EXT2_FL_USER_VISIBLE 0x0803DFFF /* User visible flags */
+#define EXT2_FL_USER_MODIFIABLE 0x080380FF /* User modifiable flags */
#else
#define EXT2_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
#define EXT2_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */
#define EXT3_RESERVED_FL 0x80000000 /* reserved for ext3 lib */
#ifdef CONFIG_VSERVER_LEGACY
-#define EXT3_FL_USER_VISIBLE 0x0C03DFFF /* User visible flags */
-#define EXT3_FL_USER_MODIFIABLE 0x0C0380FF /* User modifiable flags */
+#define EXT3_FL_USER_VISIBLE 0x0803DFFF /* User visible flags */
+#define EXT3_FL_USER_MODIFIABLE 0x080380FF /* User modifiable flags */
#else
#define EXT3_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
#define EXT3_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */
#define kmap_atomic(page, idx) page_address(page)
#define kunmap_atomic(addr, idx) do { } while (0)
-#define kmap_atomic_pfn(pfn, idx) ((char *)page_address(pfn_to_page(pfn)))
#define kmap_atomic_to_page(ptr) virt_to_page(ptr)
#endif /* CONFIG_HIGHMEM */
+++ /dev/null
-#ifndef LINUX_KEXEC_H
-#define LINUX_KEXEC_H
-
-#ifdef CONFIG_KEXEC
-#include <linux/types.h>
-#include <linux/list.h>
-#include <asm/kexec.h>
-
-/*
- * This structure is used to hold the arguments that are used when loading
- * kernel binaries.
- */
-
-typedef unsigned long kimage_entry_t;
-#define IND_DESTINATION 0x1
-#define IND_INDIRECTION 0x2
-#define IND_DONE 0x4
-#define IND_SOURCE 0x8
-
-#define KEXEC_SEGMENT_MAX 8
-struct kexec_segment {
- void *buf;
- size_t bufsz;
- void *mem;
- size_t memsz;
-};
-
-struct kimage {
- kimage_entry_t head;
- kimage_entry_t *entry;
- kimage_entry_t *last_entry;
-
- unsigned long destination;
-
- unsigned long start;
- struct page *control_code_page;
-
- unsigned long nr_segments;
- struct kexec_segment segment[KEXEC_SEGMENT_MAX];
-
- struct list_head control_pages;
- struct list_head dest_pages;
- struct list_head unuseable_pages;
-};
-
-
-/* kexec interface functions */
-extern void machine_kexec(struct kimage *image);
-extern int machine_kexec_prepare(struct kimage *image);
-extern void machine_kexec_cleanup(struct kimage *image);
-extern asmlinkage long sys_kexec(unsigned long entry, long nr_segments,
- struct kexec_segment *segments);
-extern struct page *kimage_alloc_control_pages(struct kimage *image, unsigned int order);
-extern struct kimage *kexec_image;
-extern struct kimage *kexec_crash_image;
-#endif
-#endif /* LINUX_KEXEC_H */
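In the removed header above, a kimage_entry_t packs a page address with low flag bits (IND_DESTINATION, IND_INDIRECTION, IND_DONE, IND_SOURCE) that tell the relocation loop how to interpret it. A sketch of decoding such an entry; the IND_FLAGS mask helper is added here for illustration and is not in the removed header:

```c
#include <assert.h>

typedef unsigned long kimage_entry_t;

/* Flag bits from the removed <linux/kexec.h>. */
#define IND_DESTINATION 0x1
#define IND_INDIRECTION 0x2
#define IND_DONE        0x4
#define IND_SOURCE      0x8
/* Convenience mask (illustrative, not part of the original header). */
#define IND_FLAGS (IND_DESTINATION | IND_INDIRECTION | IND_DONE | IND_SOURCE)

/* Page-aligned address carried by an entry. */
static unsigned long entry_addr(kimage_entry_t e) { return e & ~(kimage_entry_t)IND_FLAGS; }

/* Whether a given flag bit is set on the entry. */
static int entry_is(kimage_entry_t e, unsigned long flag) { return (e & flag) != 0; }
```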
extern void machine_halt(void);
extern void machine_power_off(void);
-extern void machine_shutdown(void);
-
#endif
#endif /* _LINUX_REBOOT_H */
extern int vc_get_iattr(uint32_t, void __user *);
extern int vc_set_iattr(uint32_t, void __user *);
-extern int vc_iattr_ioctl(struct dentry *de,
- unsigned int cmd,
- unsigned long arg);
-
#endif /* __KERNEL__ */
/* inode ioctls */
#define FIOC_GETXFLG _IOR('x', 5, long)
#define FIOC_SETXFLG _IOW('x', 6, long)
-#define FIOC_GETIATTR _IOR('x', 7, long)
-#define FIOC_SETIATTR _IOR('x', 8, long)
-
#else /* _VX_INODE_H */
#warning duplicate inclusion
#endif /* _VX_INODE_H */
tristate "Null Tasks Resource Manager"
depends on CKRM_TYPE_TASKCLASS
default m
-
-config CKRM_RES_MEM
- bool "Class based physical memory controller"
- default y
- depends on CKRM
- help
- Provide the basic support for collecting physical memory usage
- information among classes. Say Y if you want to know the memory
- usage of each class.
-
-config CKRM_TYPE_SOCKETCLASS
- bool "Class Manager for socket groups"
- depends on CKRM && RCFS_FS
help
Provides a Null Resource Controller for CKRM that is purely for
demonstration purposes.
depends on CKRM_TYPE_TASKCLASS
default m
help
- Provides a Resource Controller for CKRM that allows limiting number of
+ Provides a Resource Controller for CKRM that allows limiting no of
tasks a task class can have.
Say N if unsure, Y to use the feature.
-config CKRM_RES_NUMTASKS_FORKRATE
- tristate "Number of Tasks Resource Manager for Fork Rate"
- depends on CKRM_RES_NUMTASKS
- default y
- help
- Provides a Resource Controller for CKRM that allows limiting the rate
- of tasks a task class can fork per hour.
-
- Say N if unsure, Y to use the feature.
-
-
config CKRM_CPU_SCHEDULE
bool "CKRM CPU scheduler"
depends on CKRM_TYPE_TASKCLASS
Say N if unsure, Y to use the feature.
+config CKRM_RES_MEM
+ bool "Class based physical memory controller"
+ default y
+ depends on CKRM
+ help
+ Provide the basic support for collecting physical memory usage information
+ among classes. Say Y if you want to know the memory usage of each class.
+
+config CKRM_MEM_LRUORDER_CHANGE
+ bool "Change the LRU ordering of scanned pages"
+ default n
+ depends on CKRM_RES_MEM
+ help
+	  While trying to free pages, by default (n), scanned pages are left where
+	  they are found if they belong to a relatively under-used class. In this
+	  case the LRU ordering of the memory subsystem is left intact. If this
+	  option is chosen, the scanned pages are instead moved to the tail of the
+	  list (active or inactive). Saying Y reduces the checking overhead but
+	  violates the approximate LRU order maintained by the paging subsystem.
+
config CKRM_CPU_SCHEDULE_AT_BOOT
bool "Turn on at boot time"
depends on CKRM_CPU_SCHEDULE
obj-$(CONFIG_KALLSYMS) += kallsyms.o
obj-$(CONFIG_PM) += power/
obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
-obj-$(CONFIG_KEXEC) += kexec.o
obj-$(CONFIG_COMPAT) += compat.o
obj-$(CONFIG_IKCONFIG) += configs.o
obj-$(CONFIG_IKCONFIG_PROC) += configs.o
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_SYSFS) += ksysfs.o
obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
-obj-$(CONFIG_CRASH_DUMP) += crash.o
ifneq ($(CONFIG_IA64),y)
# According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
struct ckrm_cpu_class *cls = my_res, *parres, *childres;
ckrm_core_class_t *child = NULL;
int maxlimit;
- ckrm_lrq_t* queue;
- int i;
if (!cls)
return;
/*the default class can't be freed*/
if (cls == get_default_cpu_class())
return;
-#if 1
-#warning "ACB: Remove freed class from any classqueues [PL #4233]"
- for (i = 0 ; i < NR_CPUS ; i++) {
- queue = &cls->local_queues[i];
- if (cls_in_classqueue(&queue->classqueue_linkobj))
- classqueue_dequeue(queue->classqueue,
- &queue->classqueue_linkobj);
- }
-#endif
// Assuming there will be no children when this function is called
parres = ckrm_get_cpu_class(cls->parent);
total_pressure += lrq->lrq_load;
}
-#define FIX_SHARES
-#ifdef FIX_SHARES
-#warning "ACB: fix share initialization problem [PL #4227]"
+#if 1
+#warning "ACB taking out suspicious early return"
#else
if (! total_pressure)
return;
/*give idle class a high share to boost interactiveness */
lw = cpu_class_weight(clsptr);
else {
-#ifdef FIX_SHARES
- if (! total_pressure)
- return;
-#endif
lw = lrq->lrq_load * class_weight;
do_div(lw,total_pressure);
if (!lw)
static int ckrm_cpu_monitord(void *nothing)
{
daemonize("ckrm_cpu_ctrld");
- current->flags |= PF_NOFREEZE;
-
for (;;) {
/*sleep for sometime before next try*/
- set_current_state(TASK_INTERRUPTIBLE);
+ set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(CPU_MONITOR_INTERVAL);
ckrm_cpu_monitor(1);
if (thread_exit) {
struct list_head *pos;
struct page *page;
-#if 0
printk("Check<%s> %s: total=%d\n",
str, res->core->name, atomic_read(&res->pg_total));
-#endif
for (i = 0; i < MAX_NR_ZONES; i++) {
act = 0; inact = 0;
ckrm_zone = &res->ckrm_zone[i];
act++;
}
spin_unlock_irq(&zone->lru_lock);
-#if 0
printk("Check<%s>(zone=%d): act %ld, inae %ld lact %d lina %d\n",
str, i, ckrm_zone->nr_active, ckrm_zone->nr_inactive,
act, inact);
-#endif
}
}
EXPORT_SYMBOL_GPL(check_memclass);
+++ /dev/null
-/* ckrm_memcore.c - Memory Resource Manager for CKRM
- *
- * Copyright (C) Jiantao Kong, IBM Corp. 2003
- * (C) Chandra Seetharaman, IBM Corp. 2004
- *
- * Provides a Memory Resource controller for CKRM
- *
- * Latest version, more details at http://ckrm.sf.net
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- */
-
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/slab.h>
-#include <linux/list.h>
-#include <linux/spinlock.h>
-#include <linux/pagemap.h>
-#include <linux/swap.h>
-#include <linux/swapops.h>
-#include <linux/cache.h>
-#include <linux/percpu.h>
-#include <linux/pagevec.h>
-#include <linux/parser.h>
-#include <linux/ckrm_mem_inline.h>
-
-#include <asm/uaccess.h>
-#include <asm/pgtable.h>
-#include <asm/errno.h>
-
-#define MEM_RES_NAME "mem"
-
-#define CKRM_MEM_MAX_HIERARCHY 2 /* allows only upto 2 levels - 0, 1 & 2 */
-
-/* all 1-level memory_share_class are chained together */
-LIST_HEAD(ckrm_memclass_list);
-spinlock_t ckrm_mem_lock; /* protects list above */
-unsigned int ckrm_tot_lru_pages; /* # of pages in the system */
-int ckrm_nr_mem_classes = 0;
-struct ckrm_mem_res *ckrm_mem_root_class;
-atomic_t ckrm_mem_real_count = ATOMIC_INIT(0);
-
-EXPORT_SYMBOL_GPL(ckrm_memclass_list);
-EXPORT_SYMBOL_GPL(ckrm_mem_lock);
-EXPORT_SYMBOL_GPL(ckrm_tot_lru_pages);
-EXPORT_SYMBOL_GPL(ckrm_nr_mem_classes);
-EXPORT_SYMBOL_GPL(ckrm_mem_root_class);
-EXPORT_SYMBOL_GPL(ckrm_mem_real_count);
-
-void
-memclass_release(struct kref *kref)
-{
- struct ckrm_mem_res *cls = container_of(kref,
- struct ckrm_mem_res, nr_users);
- kfree(cls);
-}
-EXPORT_SYMBOL_GPL(memclass_release);
-
-static void
-set_ckrm_tot_pages(void)
-{
- struct zone *zone;
- int tot_lru_pages = 0;
-
- for_each_zone(zone) {
- tot_lru_pages += zone->nr_active;
- tot_lru_pages += zone->nr_inactive;
- tot_lru_pages += zone->free_pages;
- }
- ckrm_tot_lru_pages = tot_lru_pages;
-}
-
-/* Initialize rescls values
- * May be called on each rcfs unmount or as part of error recovery
- * to make share values sane.
- * Does not traverse hierarchy reinitializing children.
- */
-static void
-mem_res_initcls_one(struct ckrm_mem_res *res)
-{
- int zindex = 0;
- struct zone *zone;
-
- memset(res, 0, sizeof(struct ckrm_mem_res));
-
- res->shares.my_guarantee = CKRM_SHARE_DONTCARE;
- res->shares.my_limit = CKRM_SHARE_DONTCARE;
- res->shares.total_guarantee = CKRM_SHARE_DFLT_TOTAL_GUARANTEE;
- res->shares.max_limit = CKRM_SHARE_DFLT_MAX_LIMIT;
- res->shares.unused_guarantee = CKRM_SHARE_DFLT_TOTAL_GUARANTEE;
- res->shares.cur_max_limit = 0;
-
- res->pg_guar = CKRM_SHARE_DONTCARE;
- res->pg_limit = CKRM_SHARE_DONTCARE;
-
- INIT_LIST_HEAD(&res->mcls_list);
- INIT_LIST_HEAD(&res->shrink_list);
-
- for_each_zone(zone) {
- INIT_LIST_HEAD(&res->ckrm_zone[zindex].active_list);
- INIT_LIST_HEAD(&res->ckrm_zone[zindex].inactive_list);
- INIT_LIST_HEAD(&res->ckrm_zone[zindex].victim_list);
- res->ckrm_zone[zindex].nr_active = 0;
- res->ckrm_zone[zindex].nr_inactive = 0;
- res->ckrm_zone[zindex].zone = zone;
- res->ckrm_zone[zindex].memcls = res;
- zindex++;
- }
-
- res->pg_unused = 0;
- res->nr_dontcare = 1; /* for default class */
- kref_init(&res->nr_users);
-}
-
-static void
-set_impl_guar_children(struct ckrm_mem_res *parres)
-{
- struct ckrm_core_class *child = NULL;
- struct ckrm_mem_res *cres;
- int nr_dontcare = 1; // for defaultclass
- int guar, impl_guar;
- int resid = mem_rcbs.resid;
-
- ckrm_lock_hier(parres->core);
- while ((child = ckrm_get_next_child(parres->core, child)) != NULL) {
- cres = ckrm_get_res_class(child, resid, struct ckrm_mem_res);
- // treat NULL cres as don't care as that child is just being
- // created.
- // FIXME: need a better way to handle this case.
- if (!cres || cres->pg_guar == CKRM_SHARE_DONTCARE) {
- nr_dontcare++;
- }
- }
-
- parres->nr_dontcare = nr_dontcare;
- guar = (parres->pg_guar == CKRM_SHARE_DONTCARE) ?
- parres->impl_guar : parres->pg_unused;
- impl_guar = guar / parres->nr_dontcare;
-
- while ((child = ckrm_get_next_child(parres->core, child)) != NULL) {
- cres = ckrm_get_res_class(child, resid, struct ckrm_mem_res);
- if (cres && cres->pg_guar == CKRM_SHARE_DONTCARE) {
- cres->impl_guar = impl_guar;
- set_impl_guar_children(cres);
- }
- }
- ckrm_unlock_hier(parres->core);
-
-}
-
-static void *
-mem_res_alloc(struct ckrm_core_class *core, struct ckrm_core_class *parent)
-{
- struct ckrm_mem_res *res, *pres;
-
- BUG_ON(mem_rcbs.resid == -1);
-
- pres = ckrm_get_res_class(parent, mem_rcbs.resid, struct ckrm_mem_res);
- if (pres && (pres->hier == CKRM_MEM_MAX_HIERARCHY)) {
- printk(KERN_ERR "MEM_RC: only allows hieararchy of %d\n",
- CKRM_MEM_MAX_HIERARCHY);
- return NULL;
- }
-
- if ((parent == NULL) && (ckrm_mem_root_class != NULL)) {
- printk(KERN_ERR "MEM_RC: Only one root class is allowed\n");
- return NULL;
- }
-
- if ((parent != NULL) && (ckrm_mem_root_class == NULL)) {
- printk(KERN_ERR "MEM_RC: child class with no root class!!");
- return NULL;
- }
-
- res = kmalloc(sizeof(struct ckrm_mem_res), GFP_ATOMIC);
-
- if (res) {
- mem_res_initcls_one(res);
- res->core = core;
- res->parent = parent;
- spin_lock(&ckrm_mem_lock);
- list_add(&res->mcls_list, &ckrm_memclass_list);
- spin_unlock(&ckrm_mem_lock);
- if (parent == NULL) {
- /* I am the root class. So, set the max to *
- * number of pages available in the system */
- res->pg_guar = ckrm_tot_lru_pages;
- res->pg_unused = ckrm_tot_lru_pages;
- res->pg_limit = ckrm_tot_lru_pages;
- res->hier = 0;
- ckrm_mem_root_class = res;
- } else {
- int guar;
- res->hier = pres->hier + 1;
- set_impl_guar_children(pres);
- guar = (pres->pg_guar == CKRM_SHARE_DONTCARE) ?
- pres->impl_guar : pres->pg_unused;
- res->impl_guar = guar / pres->nr_dontcare;
- }
- ckrm_nr_mem_classes++;
- } else
- printk(KERN_ERR "MEM_RC: alloc: GFP_ATOMIC failed\n");
- return res;
-}
-
-/*
- * It is the caller's responsibility to make sure that the parent only
- * has chilren that are to be accounted. i.e if a new child is added
- * this function should be called after it has been added, and if a
- * child is deleted this should be called after the child is removed.
- */
-static void
-child_maxlimit_changed_local(struct ckrm_mem_res *parres)
-{
- int maxlimit = 0;
- struct ckrm_mem_res *childres;
- struct ckrm_core_class *child = NULL;
-
- /* run thru parent's children and get new max_limit of parent */
- ckrm_lock_hier(parres->core);
- while ((child = ckrm_get_next_child(parres->core, child)) != NULL) {
- childres = ckrm_get_res_class(child, mem_rcbs.resid,
- struct ckrm_mem_res);
- if (maxlimit < childres->shares.my_limit) {
- maxlimit = childres->shares.my_limit;
- }
- }
- ckrm_unlock_hier(parres->core);
- parres->shares.cur_max_limit = maxlimit;
-}
-
-/*
- * Recalculate the guarantee and limit in # of pages... and propagate the
- * same to children.
- * Caller is responsible for protecting res and for the integrity of parres
- */
-static void
-recalc_and_propagate(struct ckrm_mem_res * res, struct ckrm_mem_res * parres)
-{
- struct ckrm_core_class *child = NULL;
- struct ckrm_mem_res *cres;
- int resid = mem_rcbs.resid;
- struct ckrm_shares *self = &res->shares;
-
- if (parres) {
- struct ckrm_shares *par = &parres->shares;
-
- /* calculate pg_guar and pg_limit */
- if (parres->pg_guar == CKRM_SHARE_DONTCARE ||
- self->my_guarantee == CKRM_SHARE_DONTCARE) {
- res->pg_guar = CKRM_SHARE_DONTCARE;
- } else if (par->total_guarantee) {
- u64 temp = (u64) self->my_guarantee * parres->pg_guar;
- do_div(temp, par->total_guarantee);
- res->pg_guar = (int) temp;
- res->impl_guar = CKRM_SHARE_DONTCARE;
- } else {
- res->pg_guar = 0;
- res->impl_guar = CKRM_SHARE_DONTCARE;
- }
-
- if (parres->pg_limit == CKRM_SHARE_DONTCARE ||
- self->my_limit == CKRM_SHARE_DONTCARE) {
- res->pg_limit = CKRM_SHARE_DONTCARE;
- } else if (par->max_limit) {
- u64 temp = (u64) self->my_limit * parres->pg_limit;
- do_div(temp, par->max_limit);
- res->pg_limit = (int) temp;
- } else {
- res->pg_limit = 0;
- }
- }
-
- /* Calculate unused units */
- if (res->pg_guar == CKRM_SHARE_DONTCARE) {
- res->pg_unused = CKRM_SHARE_DONTCARE;
- } else if (self->total_guarantee) {
- u64 temp = (u64) self->unused_guarantee * res->pg_guar;
- do_div(temp, self->total_guarantee);
- res->pg_unused = (int) temp;
- } else {
- res->pg_unused = 0;
- }
-
- /* propagate to children */
- ckrm_lock_hier(res->core);
- while ((child = ckrm_get_next_child(res->core, child)) != NULL) {
- cres = ckrm_get_res_class(child, resid, struct ckrm_mem_res);
- recalc_and_propagate(cres, res);
- }
- ckrm_unlock_hier(res->core);
- return;
-}
-
-static void
-mem_res_free(void *my_res)
-{
- struct ckrm_mem_res *res = my_res;
- struct ckrm_mem_res *pres;
-
- if (!res)
- return;
-
- ckrm_mem_migrate_all_pages(res, ckrm_mem_root_class);
-
- pres = ckrm_get_res_class(res->parent, mem_rcbs.resid,
- struct ckrm_mem_res);
-
- if (pres) {
- child_guarantee_changed(&pres->shares,
- res->shares.my_guarantee, 0);
- child_maxlimit_changed_local(pres);
- recalc_and_propagate(pres, NULL);
- set_impl_guar_children(pres);
- }
-
- /*
- * Making it all zero as freeing of data structure could
- * happen later.
- */
- res->shares.my_guarantee = 0;
- res->shares.my_limit = 0;
- res->pg_guar = 0;
- res->pg_limit = 0;
- res->pg_unused = 0;
-
- spin_lock(&ckrm_mem_lock);
- list_del_init(&res->mcls_list);
- spin_unlock(&ckrm_mem_lock);
-
- res->core = NULL;
- res->parent = NULL;
- kref_put(&res->nr_users, memclass_release);
- ckrm_nr_mem_classes--;
- return;
-}
-
-static int
-mem_set_share_values(void *my_res, struct ckrm_shares *shares)
-{
- struct ckrm_mem_res *res = my_res;
- struct ckrm_mem_res *parres;
- int rc;
-
- if (!res)
- return -EINVAL;
-
- parres = ckrm_get_res_class(res->parent, mem_rcbs.resid,
- struct ckrm_mem_res);
-
- rc = set_shares(shares, &res->shares, parres ? &parres->shares : NULL);
-
- if ((rc == 0) && (parres != NULL)) {
- child_maxlimit_changed_local(parres);
- recalc_and_propagate(parres, NULL);
- set_impl_guar_children(parres);
- }
-
- return rc;
-}
-
-static int
-mem_get_share_values(void *my_res, struct ckrm_shares *shares)
-{
- struct ckrm_mem_res *res = my_res;
-
- if (!res)
- return -EINVAL;
- printk(KERN_INFO "get_share called for %s resource of class %s\n",
- MEM_RES_NAME, res->core->name);
- *shares = res->shares;
- return 0;
-}
-
-static int
-mem_get_stats(void *my_res, struct seq_file *sfile)
-{
- struct ckrm_mem_res *res = my_res;
- struct zone *zone;
- int active = 0, inactive = 0, fr = 0;
-
- if (!res)
- return -EINVAL;
-
- seq_printf(sfile, "--------- Memory Resource stats start ---------\n");
- if (res == ckrm_mem_root_class) {
- int i = 0;
- for_each_zone(zone) {
- active += zone->nr_active;
- inactive += zone->nr_inactive;
- fr += zone->free_pages;
- i++;
- }
- seq_printf(sfile,"System: tot_pages=%d,active=%d,inactive=%d"
- ",free=%d\n", ckrm_tot_lru_pages,
- active, inactive, fr);
- }
- seq_printf(sfile, "Number of pages used(including pages lent to"
- " children): %d\n", atomic_read(&res->pg_total));
- seq_printf(sfile, "Number of pages guaranteed: %d\n",
- res->pg_guar);
- seq_printf(sfile, "Maximum limit of pages: %d\n",
- res->pg_limit);
- seq_printf(sfile, "Total number of pages available"
- "(after serving guarantees to children): %d\n",
- res->pg_unused);
- seq_printf(sfile, "Number of pages lent to children: %d\n",
- res->pg_lent);
- seq_printf(sfile, "Number of pages borrowed from the parent: %d\n",
- res->pg_borrowed);
- seq_printf(sfile, "---------- Memory Resource stats end ----------\n");
-
- return 0;
-}
-
-static void
-mem_change_resclass(void *tsk, void *old, void *new)
-{
- struct mm_struct *mm;
- struct task_struct *task = tsk, *t1;
- struct ckrm_mem_res *prev_mmcls;
-
- if (!task->mm || (new == old) || (old == (void *) -1))
- return;
-
- mm = task->active_mm;
- spin_lock(&mm->peertask_lock);
- prev_mmcls = mm->memclass;
-
- if (new == NULL) {
- list_del_init(&task->mm_peers);
- } else {
- int found = 0;
- list_for_each_entry(t1, &mm->tasklist, mm_peers) {
- if (t1 == task) {
- found++;
- break;
- }
- }
- if (!found) {
- list_del_init(&task->mm_peers);
- list_add_tail(&task->mm_peers, &mm->tasklist);
- }
- }
-
- spin_unlock(&mm->peertask_lock);
- ckrm_mem_migrate_mm(mm, (struct ckrm_mem_res *) new);
- return;
-}
-
-#define MEM_FAIL_OVER "fail_over"
-#define MEM_SHRINK_AT "shrink_at"
-#define MEM_SHRINK_TO "shrink_to"
-#define MEM_SHRINK_COUNT "num_shrinks"
-#define MEM_SHRINK_INTERVAL "shrink_interval"
-
-int ckrm_mem_fail_at = 110;
-int ckrm_mem_shrink_at = 90;
-int ckrm_mem_shrink_to = 80;
-int ckrm_mem_shrink_count = 10;
-int ckrm_mem_shrink_interval = 10;
-
-EXPORT_SYMBOL_GPL(ckrm_mem_fail_at);
-EXPORT_SYMBOL_GPL(ckrm_mem_shrink_at);
-EXPORT_SYMBOL_GPL(ckrm_mem_shrink_to);
-
-static int
-mem_show_config(void *my_res, struct seq_file *sfile)
-{
- struct ckrm_mem_res *res = my_res;
-
- if (!res)
- return -EINVAL;
-
- seq_printf(sfile, "res=%s,%s=%d,%s=%d,%s=%d,%s=%d,%s=%d\n",
- MEM_RES_NAME,
- MEM_FAIL_OVER, ckrm_mem_fail_at,
- MEM_SHRINK_AT, ckrm_mem_shrink_at,
- MEM_SHRINK_TO, ckrm_mem_shrink_to,
- MEM_SHRINK_COUNT, ckrm_mem_shrink_count,
- MEM_SHRINK_INTERVAL, ckrm_mem_shrink_interval);
-
- return 0;
-}
-
-typedef int __bitwise memclass_token_t;
-
-enum memclass_token {
- mem_fail_over = (__force memclass_token_t) 1,
- mem_shrink_at = (__force memclass_token_t) 2,
- mem_shrink_to = (__force memclass_token_t) 3,
- mem_shrink_count = (__force memclass_token_t) 4,
- mem_shrink_interval = (__force memclass_token_t) 5,
- mem_err = (__force memclass_token_t) 6
-};
-
-static match_table_t mem_tokens = {
- {mem_fail_over, MEM_FAIL_OVER "=%d"},
- {mem_shrink_at, MEM_SHRINK_AT "=%d"},
- {mem_shrink_to, MEM_SHRINK_TO "=%d"},
- {mem_shrink_count, MEM_SHRINK_COUNT "=%d"},
- {mem_shrink_interval, MEM_SHRINK_INTERVAL "=%d"},
- {mem_err, NULL},
-};
-
-static int
-mem_set_config(void *my_res, const char *cfgstr)
-{
- char *p;
- struct ckrm_mem_res *res = my_res;
- int err = 0, val;
-
- if (!res)
- return -EINVAL;
-
- while ((p = strsep((char**)&cfgstr, ",")) != NULL) {
- substring_t args[MAX_OPT_ARGS];
- int token;
- if (!*p)
- continue;
-
- token = match_token(p, mem_tokens, args);
- switch (token) {
- case mem_fail_over:
- if (match_int(args, &val) || (val <= 0)) {
- err = -EINVAL;
- } else {
- ckrm_mem_fail_at = val;
- }
- break;
- case mem_shrink_at:
- if (match_int(args, &val) || (val <= 0)) {
- err = -EINVAL;
- } else {
- ckrm_mem_shrink_at = val;
- }
- break;
- case mem_shrink_to:
- if (match_int(args, &val) || (val < 0) || (val > 100)) {
- err = -EINVAL;
- } else {
- ckrm_mem_shrink_to = val;
- }
- break;
- case mem_shrink_count:
- if (match_int(args, &val) || (val <= 0)) {
- err = -EINVAL;
- } else {
- ckrm_mem_shrink_count = val;
- }
- break;
- case mem_shrink_interval:
- if (match_int(args, &val) || (val <= 0)) {
- err = -EINVAL;
- } else {
- ckrm_mem_shrink_interval = val;
- }
- break;
- default:
- err = -EINVAL;
- }
- }
- return err;
-}
-
-static int
-mem_reset_stats(void *my_res)
-{
- struct ckrm_mem_res *res = my_res;
- printk(KERN_INFO "MEM_RC: reset stats called for class %s\n",
- res->core->name);
- return 0;
-}
-
-struct ckrm_res_ctlr mem_rcbs = {
- .res_name = MEM_RES_NAME,
- .res_hdepth = CKRM_MEM_MAX_HIERARCHY,
- .resid = -1,
- .res_alloc = mem_res_alloc,
- .res_free = mem_res_free,
- .set_share_values = mem_set_share_values,
- .get_share_values = mem_get_share_values,
- .get_stats = mem_get_stats,
- .change_resclass = mem_change_resclass,
- .show_config = mem_show_config,
- .set_config = mem_set_config,
- .reset_stats = mem_reset_stats,
-};
-
-EXPORT_SYMBOL_GPL(mem_rcbs);
-
-int __init
-init_ckrm_mem_res(void)
-{
- struct ckrm_classtype *clstype;
- int resid = mem_rcbs.resid;
-
- set_ckrm_tot_pages();
- spin_lock_init(&ckrm_mem_lock);
- clstype = ckrm_find_classtype_by_name("taskclass");
- if (clstype == NULL) {
- printk(KERN_INFO " Unknown ckrm classtype<taskclass>");
- return -ENOENT;
- }
-
- if (resid == -1) {
- resid = ckrm_register_res_ctlr(clstype, &mem_rcbs);
- if (resid != -1) {
- mem_rcbs.classtype = clstype;
- }
- }
- return ((resid < 0) ? resid : 0);
-}
-
-void __exit
-exit_ckrm_mem_res(void)
-{
- ckrm_unregister_res_ctlr(&mem_rcbs);
- mem_rcbs.resid = -1;
-}
-
-module_init(init_ckrm_mem_res)
-module_exit(exit_ckrm_mem_res)
-MODULE_LICENSE("GPL");
+++ /dev/null
-/* ckrm_memctlr.c - Basic routines for the CKRM memory controller
- *
- * Copyright (C) Jiantao Kong, IBM Corp. 2003
- * (C) Chandra Seetharaman, IBM Corp. 2004
- *
- * Provides a Memory Resource controller for CKRM
- *
- * Latest version, more details at http://ckrm.sf.net
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- */
-
-#include <linux/swap.h>
-#include <linux/pagemap.h>
-#include <linux/ckrm_mem_inline.h>
-
-static int
-ckrm_mem_evaluate_page_anon(struct page* page)
-{
- struct ckrm_mem_res* pgcls = page_ckrmzone(page)->memcls;
- struct ckrm_mem_res* maxshareclass = NULL;
- struct anon_vma *anon_vma = (struct anon_vma *) page->mapping;
- struct vm_area_struct *vma;
- struct mm_struct* mm;
- int ret = 0;
-
- if (!spin_trylock(&anon_vma->lock))
- return 0;
- BUG_ON(list_empty(&anon_vma->head));
- list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
- mm = vma->vm_mm;
- if (!maxshareclass || ckrm_mem_share_compare(maxshareclass,
- mm->memclass) < 0) {
- maxshareclass = mm->memclass;
- }
- }
- spin_unlock(&anon_vma->lock);
-
- if (!maxshareclass) {
- maxshareclass = ckrm_mem_root_class;
- }
- if (pgcls != maxshareclass) {
- ckrm_change_page_class(page, maxshareclass);
- ret = 1;
- }
- return ret;
-}
-
-static int
-ckrm_mem_evaluate_page_file(struct page* page)
-{
- struct ckrm_mem_res* pgcls = page_ckrmzone(page)->memcls;
- struct ckrm_mem_res* maxshareclass = NULL;
- struct address_space *mapping = page->mapping;
- struct vm_area_struct *vma = NULL;
- pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
- struct prio_tree_iter iter;
- struct mm_struct* mm;
- int ret = 0;
-
- if (!mapping)
- return 0;
-
- if (!spin_trylock(&mapping->i_mmap_lock))
- return 0;
-
- vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap,
- pgoff, pgoff) {
- mm = vma->vm_mm;
- if (!maxshareclass || ckrm_mem_share_compare(maxshareclass,
- mm->memclass)<0)
- maxshareclass = mm->memclass;
- }
- spin_unlock(&mapping->i_mmap_lock);
-
- if (!maxshareclass) {
- maxshareclass = ckrm_mem_root_class;
- }
- if (pgcls != maxshareclass) {
- ckrm_change_page_class(page, maxshareclass);
- ret = 1;
- }
- return ret;
-}
-
-static int
-ckrm_mem_evaluate_page(struct page* page)
-{
- int ret = 0;
- if (page->mapping) {
- if (PageAnon(page))
- ret = ckrm_mem_evaluate_page_anon(page);
- else
- ret = ckrm_mem_evaluate_page_file(page);
- }
- return ret;
-}
-
-void
-ckrm_mem_migrate_all_pages(struct ckrm_mem_res* from, struct ckrm_mem_res* def)
-{
- int i;
- struct page *page;
- struct zone *zone;
- struct list_head *pos, *next;
- struct ckrm_zone *ckrm_zone;
-
- for (i = 0; i < MAX_NR_ZONES; i++) {
- ckrm_zone = &from->ckrm_zone[i];
- zone = ckrm_zone->zone;
- spin_lock_irq(&zone->lru_lock);
- pos = ckrm_zone->inactive_list.next;
- while (pos != &ckrm_zone->inactive_list) {
- next = pos->next;
- page = list_entry(pos, struct page, lru);
- if (ckrm_mem_evaluate_page(page))
- ckrm_change_page_class(page, def);
- pos = next;
- }
- pos = ckrm_zone->active_list.next;
- while (pos != &ckrm_zone->active_list) {
- next = pos->next;
- page = list_entry(pos, struct page, lru);
- if (ckrm_mem_evaluate_page(page))
- ckrm_change_page_class(page, def);
- pos = next;
- }
- spin_unlock_irq(&zone->lru_lock);
- }
- return;
-}
-
-static inline int
-class_migrate_pmd(struct mm_struct* mm, struct vm_area_struct* vma,
- pmd_t* pmdir, unsigned long address, unsigned long end)
-{
- pte_t *pte;
- unsigned long pmd_end;
-
- if (pmd_none(*pmdir))
- return 0;
- BUG_ON(pmd_bad(*pmdir));
-
- pmd_end = (address+ PMD_SIZE) & PMD_MASK;
- if (end > pmd_end)
- end = pmd_end;
-
- do {
- pte = pte_offset_map(pmdir, address);
- if (pte_present(*pte)) {
- struct page *page = pte_page(*pte);
- struct ckrm_zone *czone = page_ckrmzone(page);
- if (page->mapping && czone) {
- struct zone *zone = czone->zone;
- spin_lock_irq(&zone->lru_lock);
- ckrm_change_page_class(page, mm->memclass);
- spin_unlock_irq(&zone->lru_lock);
- }
- }
- address += PAGE_SIZE;
- pte_unmap(pte);
- pte++;
- } while(address && (address < end));
- return 0;
-}
-
-static inline int
-class_migrate_pgd(struct mm_struct* mm, struct vm_area_struct* vma,
- pgd_t* pgdir, unsigned long address, unsigned long end)
-{
- pmd_t* pmd;
- unsigned long pgd_end;
-
- if (pgd_none(*pgdir))
- return 0;
- BUG_ON(pgd_bad(*pgdir));
-
- pmd = pmd_offset(pgdir, address);
- pgd_end = (address + PGDIR_SIZE) & PGDIR_MASK;
-
- if (pgd_end && (end > pgd_end))
- end = pgd_end;
-
- do {
- class_migrate_pmd(mm, vma, pmd, address, end);
- address = (address + PMD_SIZE) & PMD_MASK;
- pmd++;
- } while (address && (address < end));
- return 0;
-}
-
-static inline int
-class_migrate_vma(struct mm_struct* mm, struct vm_area_struct* vma)
-{
- pgd_t* pgdir;
- unsigned long address, end;
-
- address = vma->vm_start;
- end = vma->vm_end;
-
- pgdir = pgd_offset(vma->vm_mm, address);
- do {
- class_migrate_pgd(mm, vma, pgdir, address, end);
- address = (address + PGDIR_SIZE) & PGDIR_MASK;
- pgdir++;
- } while(address && (address < end));
- return 0;
-}
-
-/* this function is called with mm->peertask_lock hold */
-void
-ckrm_mem_migrate_mm(struct mm_struct* mm, struct ckrm_mem_res *def)
-{
- struct task_struct *task;
- struct vm_area_struct *vma;
- struct ckrm_mem_res *maxshareclass = def;
-
- if (list_empty(&mm->tasklist)) {
- /* We leave the mm->memclass untouched since we believe that one
- * mm with no task associated will be deleted soon or attach
- * with another task later.
- */
- return;
- }
-
- list_for_each_entry(task, &mm->tasklist, mm_peers) {
- struct ckrm_mem_res* cls = ckrm_get_mem_class(task);
- if (!cls)
- continue;
- if (!maxshareclass ||
- ckrm_mem_share_compare(maxshareclass,cls)<0 )
- maxshareclass = cls;
- }
-
- if (maxshareclass && (mm->memclass != maxshareclass)) {
- if (mm->memclass) {
- kref_put(&mm->memclass->nr_users, memclass_release);
- }
- mm->memclass = maxshareclass;
- kref_get(&maxshareclass->nr_users);
-
- /* Go through all VMA to migrate pages */
- down_read(&mm->mmap_sem);
- vma = mm->mmap;
- while(vma) {
- class_migrate_vma(mm, vma);
- vma = vma->vm_next;
- }
- up_read(&mm->mmap_sem);
- }
- return;
-}
-
-static int
-shrink_weight(struct ckrm_zone *czone)
-{
- u64 temp;
- struct zone *zone = czone->zone;
- struct ckrm_mem_res *cls = czone->memcls;
- int zone_usage, zone_guar, zone_total, guar, ret, cnt;
-
- zone_usage = czone->nr_active + czone->nr_inactive;
- czone->active_over = czone->inactive_over = 0;
-
- if (zone_usage < SWAP_CLUSTER_MAX * 4)
- return 0;
-
- if (cls->pg_guar == CKRM_SHARE_DONTCARE) {
- // no guarantee for this class. use implicit guarantee
- guar = cls->impl_guar / cls->nr_dontcare;
- } else {
- guar = cls->pg_unused / cls->nr_dontcare;
- }
- zone_total = zone->nr_active + zone->nr_inactive + zone->free_pages;
- temp = (u64) guar * zone_total;
- do_div(temp, ckrm_tot_lru_pages);
- zone_guar = (int) temp;
-
- ret = ((zone_usage - zone_guar) > SWAP_CLUSTER_MAX) ?
- (zone_usage - zone_guar) : 0;
- if (ret) {
- cnt = czone->nr_active - (2 * zone_guar / 3);
- if (cnt > 0)
- czone->active_over = cnt;
- cnt = czone->active_over + czone->nr_inactive
- - zone_guar / 3;
- if (cnt > 0)
- czone->inactive_over = cnt;
- }
- return ret;
-}
-
-/* insert an entry to the list and sort decendently*/
-static void
-list_add_sort(struct list_head *entry, struct list_head *head)
-{
- struct ckrm_zone *czone, *new =
- list_entry(entry, struct ckrm_zone, victim_list);
- struct list_head* pos = head->next;
-
- while (pos != head) {
- czone = list_entry(pos, struct ckrm_zone, victim_list);
- if (new->shrink_weight > czone->shrink_weight) {
- __list_add(entry, pos->prev, pos);
- return;
- }
- pos = pos->next;
- }
- list_add_tail(entry, head);
- return;
-}
-
-static void
-shrink_choose_victims(struct list_head *victims,
- unsigned long nr_active, unsigned long nr_inactive)
-{
- unsigned long nr;
- struct ckrm_zone* czone;
- struct list_head *pos, *next;
-
- pos = victims->next;
- while ((pos != victims) && (nr_active || nr_inactive)) {
- czone = list_entry(pos, struct ckrm_zone, victim_list);
-
- if (nr_active && czone->active_over) {
- nr = min(nr_active, czone->active_over);
- czone->shrink_active += nr;
- czone->active_over -= nr;
- nr_active -= nr;
- }
-
- if (nr_inactive && czone->inactive_over) {
- nr = min(nr_inactive, czone->inactive_over);
- czone->shrink_inactive += nr;
- czone->inactive_over -= nr;
- nr_inactive -= nr;
- }
- pos = pos->next;
- }
-
- pos = victims->next;
- while (pos != victims) {
- czone = list_entry(pos, struct ckrm_zone, victim_list);
- next = pos->next;
- if (czone->shrink_active == 0 && czone->shrink_inactive == 0) {
- list_del_init(pos);
- ckrm_clear_shrink(czone);
- }
- pos = next;
- }
- return;
-}
-
-void
-shrink_get_victims(struct zone *zone, unsigned long nr_active,
- unsigned long nr_inactive, struct list_head *victims)
-{
- struct list_head *pos;
- struct ckrm_mem_res *cls;
- struct ckrm_zone *czone;
- int zoneindex = zone_idx(zone);
-
- if (ckrm_nr_mem_classes <= 1) {
- if (ckrm_mem_root_class) {
- czone = ckrm_mem_root_class->ckrm_zone + zoneindex;
- if (!ckrm_test_set_shrink(czone)) {
- list_add(&czone->victim_list, victims);
- czone->shrink_active = nr_active;
- czone->shrink_inactive = nr_inactive;
- }
- }
- return;
- }
- spin_lock(&ckrm_mem_lock);
- list_for_each_entry(cls, &ckrm_memclass_list, mcls_list) {
- czone = cls->ckrm_zone + zoneindex;
- if (ckrm_test_set_shrink(czone))
- continue;
-
- czone->shrink_active = 0;
- czone->shrink_inactive = 0;
- czone->shrink_weight = shrink_weight(czone);
- if (czone->shrink_weight) {
- list_add_sort(&czone->victim_list, victims);
- } else {
- ckrm_clear_shrink(czone);
- }
- }
- pos = victims->next;
- while (pos != victims) {
- czone = list_entry(pos, struct ckrm_zone, victim_list);
- pos = pos->next;
- }
- shrink_choose_victims(victims, nr_active, nr_inactive);
- spin_unlock(&ckrm_mem_lock);
- pos = victims->next;
- while (pos != victims) {
- czone = list_entry(pos, struct ckrm_zone, victim_list);
- pos = pos->next;
- }
-}
-
-LIST_HEAD(ckrm_shrink_list);
-void
-ckrm_shrink_atlimit(struct ckrm_mem_res *cls)
-{
- struct zone *zone;
- unsigned long now = jiffies;
- int order;
-
- if (!cls || (cls->pg_limit == CKRM_SHARE_DONTCARE) ||
- ((cls->flags & CLS_AT_LIMIT) == CLS_AT_LIMIT)) {
- return;
- }
- if ((cls->last_shrink > now) /* jiffies wrapped around */ ||
- (cls->last_shrink + (ckrm_mem_shrink_interval * HZ)) < now) {
- cls->last_shrink = now;
- cls->shrink_count = 0;
- }
- cls->shrink_count++;
- if (cls->shrink_count > ckrm_mem_shrink_count) {
- return;
- }
- spin_lock(&ckrm_mem_lock);
- list_add(&cls->shrink_list, &ckrm_shrink_list);
- spin_unlock(&ckrm_mem_lock);
- cls->flags |= CLS_AT_LIMIT;
- for_each_zone(zone) {
- /* This is just a number to get to wakeup kswapd */
- order = atomic_read(&cls->pg_total) -
- ((ckrm_mem_shrink_to * cls->pg_limit) / 100);
- wakeup_kswapd(zone);
- break; // only once is enough
- }
-}
*
*/
+/* Changes
+ *
+ * 31 Mar 2004: Created
+ *
+ */
+
/*
- * CKRM Resource controller for tracking number of tasks in a class.
+ * Code Description: TBD
*/
#include <linux/module.h>
#include <asm/div64.h>
#include <linux/list.h>
#include <linux/spinlock.h>
+#include <linux/parser.h>
#include <linux/ckrm_rc.h>
#include <linux/ckrm_tc.h>
#include <linux/ckrm_tsk.h>
-#define TOTAL_NUM_TASKS (131072) /* 128 K */
+#define DEF_TOTAL_NUM_TASKS (131072) // 128 K
+#define DEF_FORKRATE (1000000) // 1 million tasks
+#define DEF_FORKRATE_INTERVAL (3600) // per hour
#define NUMTASKS_DEBUG
#define NUMTASKS_NAME "numtasks"
-
-struct ckrm_numtasks {
- struct ckrm_core_class *core; /* the core i am part of... */
- struct ckrm_core_class *parent; /* parent of the core above. */
+#define SYS_TOTAL_TASKS "sys_total_tasks"
+#define FORKRATE "forkrate"
+#define FORKRATE_INTERVAL "forkrate_interval"
+
+static int total_numtasks = DEF_TOTAL_NUM_TASKS;
+static int total_cnt_alloc = 0;
+static int forkrate = DEF_FORKRATE;
+static int forkrate_interval = DEF_FORKRATE_INTERVAL;
+static ckrm_core_class_t *root_core;
+
+typedef struct ckrm_numtasks {
+	struct ckrm_core_class *core;	// the core I am part of...
+	struct ckrm_core_class *parent;	// parent of the core above.
struct ckrm_shares shares;
- spinlock_t cnt_lock; /* always grab parent's lock before child's */
- int cnt_guarantee; /* num_tasks guarantee in local units */
- int cnt_unused; /* has to borrow if more than this is needed */
- int cnt_limit; /* no tasks over this limit. */
- atomic_t cnt_cur_alloc; /* current alloc from self */
- atomic_t cnt_borrowed; /* borrowed from the parent */
-
- int over_guarantee; /* turn on/off when cur_alloc goes */
- /* over/under guarantee */
-
- /* internally maintained statictics to compare with max numbers */
- int limit_failures; /* # failures as request was over the limit */
- int borrow_sucesses; /* # successful borrows */
- int borrow_failures; /* # borrow failures */
-
- /* Maximum the specific statictics has reached. */
+ spinlock_t cnt_lock; // always grab parent's lock before child's
+ int cnt_guarantee; // num_tasks guarantee in local units
+ int cnt_unused; // has to borrow if more than this is needed
+ int cnt_limit; // no tasks over this limit.
+ atomic_t cnt_cur_alloc; // current alloc from self
+ atomic_t cnt_borrowed; // borrowed from the parent
+
+ int over_guarantee; // turn on/off when cur_alloc goes
+ // over/under guarantee
+
+	// internally maintained statistics to compare with max numbers
+ int limit_failures; // # failures as request was over the limit
+ int borrow_sucesses; // # successful borrows
+ int borrow_failures; // # borrow failures
+
+	// Maximum each specific statistic has reached.
int max_limit_failures;
int max_borrow_sucesses;
int max_borrow_failures;
- /* Total number of specific statistics */
+ // Total number of specific statistics
int tot_limit_failures;
int tot_borrow_sucesses;
int tot_borrow_failures;
-};
+
+ // fork rate fields
+ int forks_in_period;
+ unsigned long period_start;
+} ckrm_numtasks_t;
struct ckrm_res_ctlr numtasks_rcbs;
* to make share values sane.
* Does not traverse hierarchy reinitializing children.
*/
-static void numtasks_res_initcls_one(struct ckrm_numtasks * res)
+static void numtasks_res_initcls_one(ckrm_numtasks_t * res)
{
res->shares.my_guarantee = CKRM_SHARE_DONTCARE;
res->shares.my_limit = CKRM_SHARE_DONTCARE;
res->tot_borrow_sucesses = 0;
res->tot_borrow_failures = 0;
+ res->forks_in_period = 0;
+ res->period_start = jiffies;
+
atomic_set(&res->cnt_cur_alloc, 0);
atomic_set(&res->cnt_borrowed, 0);
return;
}
-static int numtasks_get_ref_local(struct ckrm_core_class *core, int force)
+#if 0
+static void numtasks_res_initcls(void *my_res)
{
- int rc, resid = numtasks_rcbs.resid;
- struct ckrm_numtasks *res;
+ ckrm_numtasks_t *res = my_res;
+
+ /* Write a version which propagates values all the way down
+ and replace rcbs callback with that version */
+
+}
+#endif
+
+static int numtasks_get_ref_local(void *arg, int force)
+{
+ int rc, resid = numtasks_rcbs.resid, borrowed = 0;
+ unsigned long now = jiffies, chg_at;
+ ckrm_numtasks_t *res;
+ ckrm_core_class_t *core = arg;
if ((resid < 0) || (core == NULL))
return 1;
- res = ckrm_get_res_class(core, resid, struct ckrm_numtasks);
+ res = ckrm_get_res_class(core, resid, ckrm_numtasks_t);
if (res == NULL)
return 1;
+	// force is not associated with fork, so when force is specified
+	// we don't need to enforce the fork rate.
+ if (!force) {
+		// Start a new accounting period once forkrate_interval
+		// has elapsed; time_after_eq() handles jiffies wraparound.
+		chg_at = res->period_start + forkrate_interval * HZ;
+		if (time_after_eq(now, chg_at)) {
+			res->period_start = now;
+			res->forks_in_period = 0;
+		}
+
+ if (res->forks_in_period >= forkrate) {
+ return 0;
+ }
+ }
+
atomic_inc(&res->cnt_cur_alloc);
rc = 1;
res->borrow_sucesses++;
res->tot_borrow_sucesses++;
res->over_guarantee = 1;
+ borrowed++;
} else {
res->borrow_failures++;
res->tot_borrow_failures++;
}
- } else
+ } else {
rc = force;
+ }
} else if (res->over_guarantee) {
res->over_guarantee = 0;
- if (res->max_limit_failures < res->limit_failures)
+ if (res->max_limit_failures < res->limit_failures) {
res->max_limit_failures = res->limit_failures;
- if (res->max_borrow_sucesses < res->borrow_sucesses)
+ }
+ if (res->max_borrow_sucesses < res->borrow_sucesses) {
res->max_borrow_sucesses = res->borrow_sucesses;
- if (res->max_borrow_failures < res->borrow_failures)
+ }
+ if (res->max_borrow_failures < res->borrow_failures) {
res->max_borrow_failures = res->borrow_failures;
+ }
res->limit_failures = 0;
res->borrow_sucesses = 0;
res->borrow_failures = 0;
}
- if (!rc)
+ if (!rc) {
atomic_dec(&res->cnt_cur_alloc);
+ } else if (!borrowed) {
+ total_cnt_alloc++;
+ if (!force) { // force is not associated with a real fork.
+ res->forks_in_period++;
+ }
+ }
return rc;
}
-static void numtasks_put_ref_local(struct ckrm_core_class *core)
+static void numtasks_put_ref_local(void *arg)
{
int resid = numtasks_rcbs.resid;
- struct ckrm_numtasks *res;
+ ckrm_numtasks_t *res;
+ ckrm_core_class_t *core = arg;
- if ((resid == -1) || (core == NULL))
+ if ((resid == -1) || (core == NULL)) {
return;
+ }
- res = ckrm_get_res_class(core, resid, struct ckrm_numtasks);
+ res = ckrm_get_res_class(core, resid, ckrm_numtasks_t);
if (res == NULL)
return;
-
- if (atomic_read(&res->cnt_cur_alloc)==0)
+ if (unlikely(atomic_read(&res->cnt_cur_alloc) == 0)) {
+ printk(KERN_WARNING "numtasks_put_ref: Trying to decrement "
+ "counter below 0\n");
return;
-
+ }
atomic_dec(&res->cnt_cur_alloc);
-
if (atomic_read(&res->cnt_borrowed) > 0) {
atomic_dec(&res->cnt_borrowed);
numtasks_put_ref_local(res->parent);
+ } else {
+ total_cnt_alloc--;
}
+
return;
}
static void *numtasks_res_alloc(struct ckrm_core_class *core,
struct ckrm_core_class *parent)
{
- struct ckrm_numtasks *res;
+ ckrm_numtasks_t *res;
- res = kmalloc(sizeof(struct ckrm_numtasks), GFP_ATOMIC);
+ res = kmalloc(sizeof(ckrm_numtasks_t), GFP_ATOMIC);
if (res) {
- memset(res, 0, sizeof(struct ckrm_numtasks));
+ memset(res, 0, sizeof(ckrm_numtasks_t));
res->core = core;
res->parent = parent;
numtasks_res_initcls_one(res);
res->cnt_lock = SPIN_LOCK_UNLOCKED;
if (parent == NULL) {
- /*
- * I am part of root class. So set the max tasks
- * to available default.
- */
- res->cnt_guarantee = TOTAL_NUM_TASKS;
- res->cnt_unused = TOTAL_NUM_TASKS;
- res->cnt_limit = TOTAL_NUM_TASKS;
+ // I am part of root class. So set the max tasks
+ // to available default
+ res->cnt_guarantee = total_numtasks;
+ res->cnt_unused = total_numtasks;
+ res->cnt_limit = total_numtasks;
+ root_core = core; // store the root core.
}
try_module_get(THIS_MODULE);
} else {
*/
static void numtasks_res_free(void *my_res)
{
- struct ckrm_numtasks *res = my_res, *parres, *childres;
- struct ckrm_core_class *child = NULL;
+ ckrm_numtasks_t *res = my_res, *parres, *childres;
+ ckrm_core_class_t *child = NULL;
int i, borrowed, maxlimit, resid = numtasks_rcbs.resid;
if (!res)
return;
- /* Assuming there will be no children when this function is called */
+ // Assuming there will be no children when this function is called
- parres = ckrm_get_res_class(res->parent, resid, struct ckrm_numtasks);
+ parres = ckrm_get_res_class(res->parent, resid, ckrm_numtasks_t);
- if ((borrowed = atomic_read(&res->cnt_borrowed)) > 0)
- for (i = 0; i < borrowed; i++)
- numtasks_put_ref_local(parres->core);
-
- /* return child's limit/guarantee to parent node */
+ if (unlikely(atomic_read(&res->cnt_cur_alloc) < 0)) {
+ printk(KERN_WARNING "numtasks_res: counter below 0\n");
+ }
+ if (unlikely(atomic_read(&res->cnt_cur_alloc) > 0 ||
+ atomic_read(&res->cnt_borrowed) > 0)) {
+ printk(KERN_WARNING "numtasks_res_free: resource still "
+ "alloc'd %p\n", res);
+ if ((borrowed = atomic_read(&res->cnt_borrowed)) > 0) {
+ for (i = 0; i < borrowed; i++) {
+ numtasks_put_ref_local(parres->core);
+ }
+ }
+ }
+ // return child's limit/guarantee to parent node
spin_lock(&parres->cnt_lock);
child_guarantee_changed(&parres->shares, res->shares.my_guarantee, 0);
- /* run thru parent's children and get the new max_limit of the parent */
+ // run thru parent's children and get the new max_limit of the parent
ckrm_lock_hier(parres->core);
maxlimit = 0;
while ((child = ckrm_get_next_child(parres->core, child)) != NULL) {
- childres = ckrm_get_res_class(child, resid, struct ckrm_numtasks);
- if (maxlimit < childres->shares.my_limit)
+ childres = ckrm_get_res_class(child, resid, ckrm_numtasks_t);
+ if (maxlimit < childres->shares.my_limit) {
maxlimit = childres->shares.my_limit;
+ }
}
ckrm_unlock_hier(parres->core);
- if (parres->shares.cur_max_limit < maxlimit)
+ if (parres->shares.cur_max_limit < maxlimit) {
parres->shares.cur_max_limit = maxlimit;
+ }
spin_unlock(&parres->cnt_lock);
kfree(res);
return;
}
+
/*
* Recalculate the guarantee and limit in real units... and propagate the
* same to children.
* Caller is responsible for protecting res and for the integrity of parres
*/
static void
-recalc_and_propagate(struct ckrm_numtasks * res, struct ckrm_numtasks * parres)
+recalc_and_propagate(ckrm_numtasks_t * res, ckrm_numtasks_t * parres)
{
- struct ckrm_core_class *child = NULL;
- struct ckrm_numtasks *childres;
+ ckrm_core_class_t *child = NULL;
+ ckrm_numtasks_t *childres;
int resid = numtasks_rcbs.resid;
if (parres) {
struct ckrm_shares *par = &parres->shares;
struct ckrm_shares *self = &res->shares;
- /* calculate cnt_guarantee and cnt_limit */
- if ((parres->cnt_guarantee == CKRM_SHARE_DONTCARE) ||
- (self->my_guarantee == CKRM_SHARE_DONTCARE))
+ // calculate cnt_guarantee and cnt_limit
+ //
+ if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
res->cnt_guarantee = CKRM_SHARE_DONTCARE;
- else if (par->total_guarantee) {
+ } else if (par->total_guarantee) {
u64 temp = (u64) self->my_guarantee * parres->cnt_guarantee;
do_div(temp, par->total_guarantee);
res->cnt_guarantee = (int) temp;
- } else
+ } else {
res->cnt_guarantee = 0;
+ }
- if ((parres->cnt_limit == CKRM_SHARE_DONTCARE) ||
- (self->my_limit == CKRM_SHARE_DONTCARE))
+ if (parres->cnt_limit == CKRM_SHARE_DONTCARE) {
res->cnt_limit = CKRM_SHARE_DONTCARE;
- else if (par->max_limit) {
+ } else if (par->max_limit) {
u64 temp = (u64) self->my_limit * parres->cnt_limit;
do_div(temp, par->max_limit);
res->cnt_limit = (int) temp;
- } else
+ } else {
res->cnt_limit = 0;
+ }
- /* Calculate unused units */
- if ((res->cnt_guarantee == CKRM_SHARE_DONTCARE) ||
- (self->my_guarantee == CKRM_SHARE_DONTCARE))
+ // Calculate unused units
+ if (res->cnt_guarantee == CKRM_SHARE_DONTCARE) {
res->cnt_unused = CKRM_SHARE_DONTCARE;
- else if (self->total_guarantee) {
+ } else if (self->total_guarantee) {
u64 temp = (u64) self->unused_guarantee * res->cnt_guarantee;
do_div(temp, self->total_guarantee);
res->cnt_unused = (int) temp;
- } else
+ } else {
res->cnt_unused = 0;
+ }
}
-
- /* propagate to children */
+ // propagate to children
ckrm_lock_hier(res->core);
while ((child = ckrm_get_next_child(res->core, child)) != NULL) {
- childres = ckrm_get_res_class(child, resid, struct ckrm_numtasks);
-
- spin_lock(&childres->cnt_lock);
- recalc_and_propagate(childres, res);
- spin_unlock(&childres->cnt_lock);
+ childres = ckrm_get_res_class(child, resid, ckrm_numtasks_t);
+ if (childres) {
+ spin_lock(&childres->cnt_lock);
+ recalc_and_propagate(childres, res);
+ spin_unlock(&childres->cnt_lock);
+ } else {
+ printk(KERN_ERR "%s: numtasks resclass missing\n", __FUNCTION__);
+ }
}
ckrm_unlock_hier(res->core);
return;
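The scaling that `recalc_and_propagate()` performs with `do_div()` is a plain proportional split: a child's absolute allocation is its relative guarantee times the parent's absolute count, divided by the parent's total guarantee. A hedged user-space sketch (the helper name is ours; the kernel code works on `ckrm_shares` fields and `CKRM_SHARE_DONTCARE`):

```c
#include <assert.h>
#include <stdint.h>

#define CKRM_SHARE_DONTCARE (-1)  /* mirrors the CKRM constant */

/* Scale a child's relative guarantee into absolute task units out of
 * the parent's allocation, widening to 64 bits (as do_div() does) so
 * the intermediate multiplication cannot overflow. */
static int shares_to_units(int my_guarantee, int parent_units,
			   int total_guarantee)
{
	uint64_t temp;

	if (parent_units == CKRM_SHARE_DONTCARE)
		return CKRM_SHARE_DONTCARE;   /* unconstrained parent */
	if (total_guarantee == 0)
		return 0;                     /* nothing shared out yet */
	temp = (uint64_t)my_guarantee * (uint64_t)parent_units;
	return (int)(temp / (uint64_t)total_guarantee);
}
```

For example, a child holding 25 of 100 shares under a parent with 1000 task units receives 250.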
static int numtasks_set_share_values(void *my_res, struct ckrm_shares *new)
{
- struct ckrm_numtasks *parres, *res = my_res;
+ ckrm_numtasks_t *parres, *res = my_res;
struct ckrm_shares *cur = &res->shares, *par;
int rc = -EINVAL, resid = numtasks_rcbs.resid;
if (res->parent) {
parres =
- ckrm_get_res_class(res->parent, resid, struct ckrm_numtasks);
+ ckrm_get_res_class(res->parent, resid, ckrm_numtasks_t);
spin_lock(&parres->cnt_lock);
spin_lock(&res->cnt_lock);
par = &parres->shares;
rc = set_shares(new, cur, par);
if ((rc == 0) && parres) {
- /* Calculate parent's unused units */
- if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE)
+ // Calculate parent's unused units
+ if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
parres->cnt_unused = CKRM_SHARE_DONTCARE;
- else if (par->total_guarantee) {
+ } else if (par->total_guarantee) {
u64 temp = (u64) par->unused_guarantee * parres->cnt_guarantee;
do_div(temp, par->total_guarantee);
parres->cnt_unused = (int) temp;
- } else
+ } else {
parres->cnt_unused = 0;
+ }
recalc_and_propagate(res, parres);
}
spin_unlock(&res->cnt_lock);
- if (res->parent)
+ if (res->parent) {
spin_unlock(&parres->cnt_lock);
+ }
return rc;
}
static int numtasks_get_share_values(void *my_res, struct ckrm_shares *shares)
{
- struct ckrm_numtasks *res = my_res;
+ ckrm_numtasks_t *res = my_res;
if (!res)
return -EINVAL;
static int numtasks_get_stats(void *my_res, struct seq_file *sfile)
{
- struct ckrm_numtasks *res = my_res;
+ ckrm_numtasks_t *res = my_res;
if (!res)
return -EINVAL;
- seq_printf(sfile, "---------Number of tasks stats start---------\n");
+ seq_printf(sfile, "Number of tasks resource:\n");
seq_printf(sfile, "Total Over limit failures: %d\n",
res->tot_limit_failures);
seq_printf(sfile, "Total Over guarantee sucesses: %d\n",
res->max_borrow_sucesses);
seq_printf(sfile, "Maximum Over guarantee failures: %d\n",
res->max_borrow_failures);
- seq_printf(sfile, "---------Number of tasks stats end---------\n");
#ifdef NUMTASKS_DEBUG
seq_printf(sfile,
"cur_alloc %d; borrowed %d; cnt_guar %d; cnt_limit %d "
static int numtasks_show_config(void *my_res, struct seq_file *sfile)
{
- struct ckrm_numtasks *res = my_res;
+ ckrm_numtasks_t *res = my_res;
if (!res)
return -EINVAL;
- seq_printf(sfile, "res=%s,parameter=somevalue\n", NUMTASKS_NAME);
+ seq_printf(sfile, "res=%s,%s=%d,%s=%d,%s=%d\n", NUMTASKS_NAME,
+ SYS_TOTAL_TASKS, total_numtasks,
+ FORKRATE, forkrate,
+ FORKRATE_INTERVAL, forkrate_interval);
return 0;
}
+enum numtasks_token_t {
+ numtasks_token_total,
+ numtasks_token_forkrate,
+ numtasks_token_interval,
+ numtasks_token_err
+};
+
+static match_table_t numtasks_tokens = {
+ {numtasks_token_total, SYS_TOTAL_TASKS "=%d"},
+ {numtasks_token_forkrate, FORKRATE "=%d"},
+ {numtasks_token_interval, FORKRATE_INTERVAL "=%d"},
+ {numtasks_token_err, NULL},
+};
+
+static void reset_forkrates(ckrm_core_class_t *parent, unsigned long now)
+{
+ ckrm_numtasks_t *parres;
+ ckrm_core_class_t *child = NULL;
+
+ parres = ckrm_get_res_class(parent, numtasks_rcbs.resid,
+ ckrm_numtasks_t);
+ if (!parres) {
+ return;
+ }
+ parres->forks_in_period = 0;
+ parres->period_start = now;
+
+ ckrm_lock_hier(parent);
+ while ((child = ckrm_get_next_child(parent, child)) != NULL) {
+ reset_forkrates(child, now);
+ }
+ ckrm_unlock_hier(parent);
+}
+
static int numtasks_set_config(void *my_res, const char *cfgstr)
{
- struct ckrm_numtasks *res = my_res;
+ char *p;
+ ckrm_numtasks_t *res = my_res;
+ int new_total, fr = 0, itvl = 0, err = 0;
if (!res)
return -EINVAL;
- printk("numtasks config='%s'\n", cfgstr);
- return 0;
+
+ while ((p = strsep((char**)&cfgstr, ",")) != NULL) {
+ substring_t args[MAX_OPT_ARGS];
+ int token;
+ if (!*p)
+ continue;
+
+ token = match_token(p, numtasks_tokens, args);
+ switch (token) {
+ case numtasks_token_total:
+ if (match_int(args, &new_total) ||
+ (new_total < total_cnt_alloc)) {
+ err = -EINVAL;
+ } else {
+ total_numtasks = new_total;
+
+ // res is the default class, as config is present only
+ // in that directory
+ spin_lock(&res->cnt_lock);
+ res->cnt_guarantee = total_numtasks;
+ res->cnt_unused = total_numtasks;
+ res->cnt_limit = total_numtasks;
+ recalc_and_propagate(res, NULL);
+ spin_unlock(&res->cnt_lock);
+ }
+ break;
+ case numtasks_token_forkrate:
+ if (match_int(args, &fr) || (fr <= 0)) {
+ err = -EINVAL;
+ } else {
+ forkrate = fr;
+ }
+ break;
+ case numtasks_token_interval:
+ if (match_int(args, &itvl) || (itvl <= 0)) {
+ err = -EINVAL;
+ } else {
+ forkrate_interval = itvl;
+ }
+ break;
+ default:
+ err = -EINVAL;
+ }
+ }
+ if ((fr > 0) || (itvl > 0)) {
+ reset_forkrates(root_core, jiffies);
+ }
+ return err;
}
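The comma-separated "key=value" parsing that `numtasks_set_config()` does with `match_token()` can be sketched in user space with plain `strsep()`/`atoi()`. The key strings below spell out what the `SYS_TOTAL_TASKS`, `FORKRATE`, and `FORKRATE_INTERVAL` macros are assumed to expand to, and the range checks the patch performs (e.g. rejecting a total below `total_cnt_alloc`) are deliberately omitted:

```c
#define _GNU_SOURCE               /* for strsep() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse "key=value,key=value" into the three numtasks settings.
 * Returns 0 on success, -1 on a malformed entry or unknown key. */
static int parse_numtasks_cfg(char *cfg, int *total, int *rate, int *interval)
{
	char *p;

	while ((p = strsep(&cfg, ",")) != NULL) {
		char *eq;

		if (!*p)
			continue;             /* skip empty tokens */
		eq = strchr(p, '=');
		if (!eq)
			return -1;            /* malformed entry */
		*eq++ = '\0';
		if (strcmp(p, "sys_total_tasks") == 0)
			*total = atoi(eq);
		else if (strcmp(p, "forkrate") == 0)
			*rate = atoi(eq);
		else if (strcmp(p, "forkrate_interval") == 0)
			*interval = atoi(eq);
		else
			return -1;            /* unknown key */
	}
	return 0;
}
```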
static void numtasks_change_resclass(void *task, void *old, void *new)
{
- struct ckrm_numtasks *oldres = old;
- struct ckrm_numtasks *newres = new;
+ ckrm_numtasks_t *oldres = old;
+ ckrm_numtasks_t *newres = new;
if (oldres != (void *)-1) {
struct task_struct *tsk = task;
&(tsk->parent->taskclass->core);
oldres =
ckrm_get_res_class(old_core, numtasks_rcbs.resid,
- struct ckrm_numtasks);
+ ckrm_numtasks_t);
}
- if (oldres)
- numtasks_put_ref_local(oldres->core);
+ numtasks_put_ref_local(oldres->core);
}
- if (newres)
+ if (newres) {
(void)numtasks_get_ref_local(newres->core, 1);
+ }
}
struct ckrm_res_ctlr numtasks_rcbs = {
if (resid == -1) {
resid = ckrm_register_res_ctlr(clstype, &numtasks_rcbs);
- printk("........init_ckrm_numtasks_res -> %d\n", resid);
+ printk(KERN_DEBUG "........init_ckrm_numtasks_res -> %d\n", resid);
if (resid != -1) {
ckrm_numtasks_register(numtasks_get_ref_local,
numtasks_put_ref_local);
void __exit exit_ckrm_numtasks_res(void)
{
- if (numtasks_rcbs.resid != -1)
+ if (numtasks_rcbs.resid != -1) {
ckrm_numtasks_register(NULL, NULL);
+ }
ckrm_unregister_res_ctlr(&numtasks_rcbs);
numtasks_rcbs.resid = -1;
}
module_init(init_ckrm_numtasks_res)
module_exit(exit_ckrm_numtasks_res)
MODULE_LICENSE("GPL");
*
*/
+/* Changes
+ *
+ * 16 May 2004: Created
+ *
+ */
+
#include <linux/spinlock.h>
#include <linux/module.h>
#include <linux/ckrm_tsk.h>
spin_unlock(&stub_lock);
}
-int numtasks_get_ref(struct ckrm_core_class *arg, int force)
+int numtasks_get_ref(void *arg, int force)
{
int ret = 1;
spin_lock(&stub_lock);
return ret;
}
-void numtasks_put_ref(struct ckrm_core_class *arg)
+void numtasks_put_ref(void *arg)
{
spin_lock(&stub_lock);
if (real_put_ref) {
+++ /dev/null
-/*
- * kernel/crash.c - Memory preserving reboot related code.
- *
- * Created by: Hariprasad Nellitheertha (hari@in.ibm.com)
- * Copyright (C) IBM Corporation, 2004. All rights reserved
- */
-
-#include <linux/smp_lock.h>
-#include <linux/kexec.h>
-#include <linux/errno.h>
-#include <linux/proc_fs.h>
-#include <linux/bootmem.h>
-#include <linux/highmem.h>
-#include <linux/crash_dump.h>
-
-#include <asm/io.h>
-#include <asm/uaccess.h>
-
-#ifdef CONFIG_PROC_FS
-/*
- * Enable kexec reboot upon panic; for dumping
- */
-static ssize_t write_crash_dump_on(struct file *file, const char __user *buf,
- size_t count, loff_t *ppos)
-{
- if (count) {
- if (get_user(crash_dump_on, buf))
- return -EFAULT;
- }
- return count;
-}
-
-static struct file_operations proc_crash_dump_on_operations = {
- .write = write_crash_dump_on,
-};
-
-extern struct file_operations proc_vmcore_operations;
-extern struct proc_dir_entry *proc_vmcore;
-
-void crash_enable_by_proc(void)
-{
- struct proc_dir_entry *entry;
-
- entry = create_proc_entry("kexec-dump", S_IWUSR, NULL);
- if (entry)
- entry->proc_fops = &proc_crash_dump_on_operations;
-}
-
-void crash_create_proc_entry(void)
-{
- if (dump_enabled) {
- proc_vmcore = create_proc_entry("vmcore", S_IRUSR, NULL);
- if (proc_vmcore) {
- proc_vmcore->proc_fops = &proc_vmcore_operations;
- proc_vmcore->size =
- (size_t)(saved_max_pfn << PAGE_SHIFT);
- }
- }
-}
-
-#endif /* CONFIG_PROC_FS */
-
-void __crash_machine_kexec(void)
-{
- struct kimage *image;
-
- if ((!crash_dump_on) || (crashed))
- return;
-
- image = xchg(&kexec_crash_image, 0);
- if (image) {
- crashed = 1;
- printk(KERN_EMERG "kexec: opening parachute\n");
- crash_dump_stop_cpus();
- crash_dump_save_registers();
-
- /* If we are here to do a crash dump, save the memory from
- * 0-640k before we copy over the kexec kernel image. Otherwise
- * our dump will show the wrong kernel entirely.
- */
- crash_relocate_mem();
-
- machine_kexec(image);
- } else {
- printk(KERN_EMERG "kexec: No kernel image loaded!\n");
- }
-}
-
-/*
- * Copy a page from "oldmem". For this page, there is no pte mapped
- * in the current kernel. We stitch up a pte, similar to kmap_atomic.
- */
-ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
- size_t csize, int userbuf)
-{
- void *page, *vaddr;
-
- if (!csize)
- return 0;
-
- page = kmalloc(PAGE_SIZE, GFP_KERNEL);
-
- vaddr = kmap_atomic_pfn(pfn, KM_PTE0);
- copy_page(page, vaddr);
- kunmap_atomic(vaddr, KM_PTE0);
-
- if (userbuf) {
- if (copy_to_user(buf, page, csize)) {
- kfree(page);
- return -EFAULT;
- }
- } else
- memcpy(buf, page, csize);
- kfree(page);
-
- return 0;
-}
+++ /dev/null
-/*
- * kexec.c - kexec system call
- * Copyright (C) 2002-2004 Eric Biederman <ebiederm@xmission.com>
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#include <linux/mm.h>
-#include <linux/file.h>
-#include <linux/slab.h>
-#include <linux/fs.h>
-#include <linux/kexec.h>
-#include <linux/spinlock.h>
-#include <linux/list.h>
-#include <linux/highmem.h>
-#include <net/checksum.h>
-#include <asm/page.h>
-#include <asm/uaccess.h>
-#include <asm/io.h>
-#include <asm/system.h>
-
-/*
- * When kexec transitions to the new kernel there is a one-to-one
- * mapping between physical and virtual addresses. On processors
- * where you can disable the MMU this is trivial, and easy. For
- * others it is still a simple predictable page table to setup.
- *
- * In that environment kexec copies the new kernel to its final
- * resting place. This means I can only support memory whose
- * physical address can fit in an unsigned long. In particular
- * addresses where (pfn << PAGE_SHIFT) > ULONG_MAX cannot be handled.
- * If the assembly stub has more restrictive requirements
- * KEXEC_SOURCE_MEMORY_LIMIT and KEXEC_DEST_MEMORY_LIMIT can be
- * defined more restrictively in <asm/kexec.h>.
- *
- * The code for the transition from the current kernel to the
- * the new kernel is placed in the control_code_buffer, whose size
- * is given by KEXEC_CONTROL_CODE_SIZE. In the best case only a single
- * page of memory is necessary, but some architectures require more.
- * Because this memory must be identity mapped in the transition from
- * virtual to physical addresses it must live in the range
- * 0 - TASK_SIZE, as only the user space mappings are arbitrarily
- * modifiable.
- *
- * The assembly stub in the control code buffer is passed a linked list
- * of descriptor pages detailing the source pages of the new kernel,
- * and the destination addresses of those source pages. As this data
- * structure is not used in the context of the current OS, it must
- * be self-contained.
- *
- * The code has been made to work with highmem pages and will use a
- * destination page in its final resting place (if it happens
- * to allocate it). The end product of this is that most of the
- * physical address space, and most of RAM can be used.
- *
- * Future directions include:
- * - allocating a page table with the control code buffer identity
- * mapped, to simplify machine_kexec and make kexec_on_panic more
- * reliable.
- */
-
-/*
- * KIMAGE_NO_DEST is an impossible destination address..., for
- * allocating pages whose destination address we do not care about.
- */
-#define KIMAGE_NO_DEST (-1UL)
-
-static int kimage_is_destination_range(
- struct kimage *image, unsigned long start, unsigned long end);
-static struct page *kimage_alloc_page(struct kimage *image, unsigned int gfp_mask, unsigned long dest);
-
-
-static int kimage_alloc(struct kimage **rimage,
- unsigned long nr_segments, struct kexec_segment *segments)
-{
- int result;
- struct kimage *image;
- size_t segment_bytes;
- unsigned long i;
-
- /* Allocate a controlling structure */
- result = -ENOMEM;
- image = kmalloc(sizeof(*image), GFP_KERNEL);
- if (!image) {
- goto out;
- }
- memset(image, 0, sizeof(*image));
- image->head = 0;
- image->entry = &image->head;
- image->last_entry = &image->head;
-
- /* Initialize the list of control pages */
- INIT_LIST_HEAD(&image->control_pages);
-
- /* Initialize the list of destination pages */
- INIT_LIST_HEAD(&image->dest_pages);
-
- /* Initialize the list of unuseable pages */
- INIT_LIST_HEAD(&image->unuseable_pages);
-
- /* Read in the segments */
- image->nr_segments = nr_segments;
- segment_bytes = nr_segments * sizeof*segments;
- result = copy_from_user(image->segment, segments, segment_bytes);
- if (result)
- goto out;
-
- /*
- * Verify we have good destination addresses. The caller is
- * responsible for making certain we don't attempt to load
- * the new image into invalid or reserved areas of RAM. This
- * just verifies it is an address we can use.
- */
- result = -EADDRNOTAVAIL;
- for (i = 0; i < nr_segments; i++) {
- unsigned long mend;
- mend = ((unsigned long)(image->segment[i].mem)) +
- image->segment[i].memsz;
- if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)
- goto out;
- }
-
- /*
- * Find a location for the control code buffer, and add it
- * the vector of segments so that it's pages will also be
- * counted as destination pages.
- */
- result = -ENOMEM;
- image->control_code_page = kimage_alloc_control_pages(image,
- get_order(KEXEC_CONTROL_CODE_SIZE));
- if (!image->control_code_page) {
- printk(KERN_ERR "Could not allocate control_code_buffer\n");
- goto out;
- }
-
- result = 0;
- out:
- if (result == 0) {
- *rimage = image;
- } else {
- kfree(image);
- }
- return result;
-}
-
-static int kimage_is_destination_range(
- struct kimage *image, unsigned long start, unsigned long end)
-{
- unsigned long i;
-
- for (i = 0; i < image->nr_segments; i++) {
- unsigned long mstart, mend;
- mstart = (unsigned long)image->segment[i].mem;
- mend = mstart + image->segment[i].memsz;
- if ((end > mstart) && (start < mend)) {
- return 1;
- }
- }
- return 0;
-}
-
-static struct page *kimage_alloc_pages(unsigned int gfp_mask, unsigned int order)
-{
- struct page *pages;
- pages = alloc_pages(gfp_mask, order);
- if (pages) {
- unsigned int count, i;
- pages->mapping = NULL;
- pages->private = order;
- count = 1 << order;
- for(i = 0; i < count; i++) {
- SetPageReserved(pages + i);
- }
- }
- return pages;
-}
-
-static void kimage_free_pages(struct page *page)
-{
- unsigned int order, count, i;
- order = page->private;
- count = 1 << order;
- for(i = 0; i < count; i++) {
- ClearPageReserved(page + i);
- }
- __free_pages(page, order);
-}
-
-static void kimage_free_page_list(struct list_head *list)
-{
- struct list_head *pos, *next;
- list_for_each_safe(pos, next, list) {
- struct page *page;
-
- page = list_entry(pos, struct page, lru);
- list_del(&page->lru);
-
- kimage_free_pages(page);
- }
-}
-
-struct page *kimage_alloc_control_pages(struct kimage *image, unsigned int order)
-{
- /* Control pages are special, they are the intermediaries
- * that are needed while we copy the rest of the pages
- * to their final resting place. As such they must
- * not conflict with either the destination addresses
- * or memory the kernel is already using.
- *
- * The only case where we really need more than one of
- * these are for architectures where we cannot disable
- * the MMU and must instead generate an identity mapped
- * page table for all of the memory.
- *
- * At worst this runs in O(N) of the image size.
- */
- struct list_head extra_pages;
- struct page *pages;
- unsigned int count;
-
- count = 1 << order;
- INIT_LIST_HEAD(&extra_pages);
-
- /* Loop while I can allocate a page and the page allocated
- * is a destination page.
- */
- do {
- unsigned long pfn, epfn, addr, eaddr;
- pages = kimage_alloc_pages(GFP_KERNEL, order);
- if (!pages)
- break;
- pfn = page_to_pfn(pages);
- epfn = pfn + count;
- addr = pfn << PAGE_SHIFT;
- eaddr = epfn << PAGE_SHIFT;
- if ((epfn >= (KEXEC_CONTROL_MEMORY_LIMIT >> PAGE_SHIFT)) ||
- kimage_is_destination_range(image, addr, eaddr))
- {
- list_add(&pages->lru, &extra_pages);
- pages = NULL;
- }
- } while(!pages);
- if (pages) {
- /* Remember the allocated page... */
- list_add(&pages->lru, &image->control_pages);
-
- /* Because the page is already in it's destination
- * location we will never allocate another page at
- * that address. Therefore kimage_alloc_pages
- * will not return it (again) and we don't need
- * to give it an entry in image->segment[].
- */
- }
- /* Deal with the destination pages I have inadvertently allocated.
- *
- * Ideally I would convert multi-page allocations into single
- * page allocations, and add everyting to image->dest_pages.
- *
- * For now it is simpler to just free the pages.
- */
- kimage_free_page_list(&extra_pages);
- return pages;
-
-}
-
-static int kimage_add_entry(struct kimage *image, kimage_entry_t entry)
-{
- if (*image->entry != 0) {
- image->entry++;
- }
- if (image->entry == image->last_entry) {
- kimage_entry_t *ind_page;
- struct page *page;
- page = kimage_alloc_page(image, GFP_KERNEL, KIMAGE_NO_DEST);
- if (!page) {
- return -ENOMEM;
- }
- ind_page = page_address(page);
- *image->entry = virt_to_phys(ind_page) | IND_INDIRECTION;
- image->entry = ind_page;
- image->last_entry =
- ind_page + ((PAGE_SIZE/sizeof(kimage_entry_t)) - 1);
- }
- *image->entry = entry;
- image->entry++;
- *image->entry = 0;
- return 0;
-}
-
-static int kimage_set_destination(
- struct kimage *image, unsigned long destination)
-{
- int result;
-
- destination &= PAGE_MASK;
- result = kimage_add_entry(image, destination | IND_DESTINATION);
- if (result == 0) {
- image->destination = destination;
- }
- return result;
-}
-
-
-static int kimage_add_page(struct kimage *image, unsigned long page)
-{
- int result;
-
- page &= PAGE_MASK;
- result = kimage_add_entry(image, page | IND_SOURCE);
- if (result == 0) {
- image->destination += PAGE_SIZE;
- }
- return result;
-}
-
-
-static void kimage_free_extra_pages(struct kimage *image)
-{
- /* Walk through and free any extra destination pages I may have */
- kimage_free_page_list(&image->dest_pages);
-
- /* Walk through and free any unuseable pages I have cached */
- kimage_free_page_list(&image->unuseable_pages);
-
-}
-static int kimage_terminate(struct kimage *image)
-{
- int result;
-
- result = kimage_add_entry(image, IND_DONE);
- if (result == 0) {
- /* Point at the terminating element */
- image->entry--;
- kimage_free_extra_pages(image);
- }
- return result;
-}
-
-#define for_each_kimage_entry(image, ptr, entry) \
- for (ptr = &image->head; (entry = *ptr) && !(entry & IND_DONE); \
- ptr = (entry & IND_INDIRECTION)? \
- phys_to_virt((entry & PAGE_MASK)): ptr +1)
-
-static void kimage_free_entry(kimage_entry_t entry)
-{
- struct page *page;
-
- page = pfn_to_page(entry >> PAGE_SHIFT);
- kimage_free_pages(page);
-}
-
-static void kimage_free(struct kimage *image)
-{
- kimage_entry_t *ptr, entry;
- kimage_entry_t ind = 0;
-
- if (!image)
- return;
- kimage_free_extra_pages(image);
- for_each_kimage_entry(image, ptr, entry) {
- if (entry & IND_INDIRECTION) {
- /* Free the previous indirection page */
- if (ind & IND_INDIRECTION) {
- kimage_free_entry(ind);
- }
- /* Save this indirection page until we are
- * done with it.
- */
- ind = entry;
- }
- else if (entry & IND_SOURCE) {
- kimage_free_entry(entry);
- }
- }
- /* Free the final indirection page */
- if (ind & IND_INDIRECTION) {
- kimage_free_entry(ind);
- }
-
- /* Handle any machine specific cleanup */
- machine_kexec_cleanup(image);
-
- /* Free the kexec control pages... */
- kimage_free_page_list(&image->control_pages);
- kfree(image);
-}
-
-static kimage_entry_t *kimage_dst_used(struct kimage *image, unsigned long page)
-{
- kimage_entry_t *ptr, entry;
- unsigned long destination = 0;
-
- for_each_kimage_entry(image, ptr, entry) {
- if (entry & IND_DESTINATION) {
- destination = entry & PAGE_MASK;
- }
- else if (entry & IND_SOURCE) {
- if (page == destination) {
- return ptr;
- }
- destination += PAGE_SIZE;
- }
- }
- return 0;
-}
-
-static struct page *kimage_alloc_page(struct kimage *image, unsigned int gfp_mask, unsigned long destination)
-{
- /*
- * Here we implement safeguards to ensure that a source page
- * is not copied to its destination page before the data on
- * the destination page is no longer useful.
- *
- * To do this we maintain the invariant that a source page is
- * either its own destination page, or it is not a
- * destination page at all.
- *
- * That is slightly stronger than required, but the proof
- * that no problems will not occur is trivial, and the
- * implementation is simply to verify.
- *
- * When allocating all pages normally this algorithm will run
- * in O(N) time, but in the worst case it will run in O(N^2)
- * time. If the runtime is a problem the data structures can
- * be fixed.
- */
- struct page *page;
- unsigned long addr;
-
- /*
- * Walk through the list of destination pages, and see if I
- * have a match.
- */
- list_for_each_entry(page, &image->dest_pages, lru) {
- addr = page_to_pfn(page) << PAGE_SHIFT;
- if (addr == destination) {
- list_del(&page->lru);
- return page;
- }
- }
- page = NULL;
- while (1) {
- kimage_entry_t *old;
-
- /* Allocate a page, if we run out of memory give up */
- page = kimage_alloc_pages(gfp_mask, 0);
- if (!page) {
- return 0;
- }
- /* If the page cannot be used file it away */
- if (page_to_pfn(page) > (KEXEC_SOURCE_MEMORY_LIMIT >> PAGE_SHIFT)) {
- list_add(&page->lru, &image->unuseable_pages);
- continue;
- }
- addr = page_to_pfn(page) << PAGE_SHIFT;
-
- /* If it is the destination page we want use it */
- if (addr == destination)
- break;
-
- /* If the page is not a destination page use it */
- if (!kimage_is_destination_range(image, addr, addr + PAGE_SIZE))
- break;
-
- /*
- * I know that the page is someones destination page.
- * See if there is already a source page for this
- * destination page. And if so swap the source pages.
- */
- old = kimage_dst_used(image, addr);
- if (old) {
- /* If so move it */
- unsigned long old_addr;
- struct page *old_page;
-
- old_addr = *old & PAGE_MASK;
- old_page = pfn_to_page(old_addr >> PAGE_SHIFT);
- copy_highpage(page, old_page);
- *old = addr | (*old & ~PAGE_MASK);
-
- /* The old page I have found cannot be a
- * destination page, so return it.
- */
- addr = old_addr;
- page = old_page;
- break;
- }
- else {
- /* Place the page on the destination list I
- * will use it later.
- */
- list_add(&page->lru, &image->dest_pages);
- }
- }
- return page;
-}
-
-static int kimage_load_segment(struct kimage *image,
- struct kexec_segment *segment)
-{
- unsigned long mstart;
- int result;
- unsigned long offset;
- unsigned long offset_end;
- unsigned char *buf;
-
- result = 0;
- buf = segment->buf;
- mstart = (unsigned long)segment->mem;
-
- offset_end = segment->memsz;
-
- result = kimage_set_destination(image, mstart);
- if (result < 0) {
- goto out;
- }
- for (offset = 0; offset < segment->memsz; offset += PAGE_SIZE) {
- struct page *page;
- char *ptr;
- size_t size, leader;
- page = kimage_alloc_page(image, GFP_HIGHUSER, mstart + offset);
- if (page == 0) {
- result = -ENOMEM;
- goto out;
- }
- result = kimage_add_page(image, page_to_pfn(page) << PAGE_SHIFT);
- if (result < 0) {
- goto out;
- }
- ptr = kmap(page);
- if (segment->bufsz < offset) {
- /* We are past the end zero the whole page */
- memset(ptr, 0, PAGE_SIZE);
- kunmap(page);
- continue;
- }
- size = PAGE_SIZE;
- leader = 0;
- if ((offset == 0)) {
- leader = mstart & ~PAGE_MASK;
- }
- if (leader) {
- /* We are on the first page zero the unused portion */
- memset(ptr, 0, leader);
- size -= leader;
- ptr += leader;
- }
- if (size > (segment->bufsz - offset)) {
- size = segment->bufsz - offset;
- }
- if (size < (PAGE_SIZE - leader)) {
- /* zero the trailing part of the page */
- memset(ptr + size, 0, (PAGE_SIZE - leader) - size);
- }
- result = copy_from_user(ptr, buf + offset, size);
- kunmap(page);
- if (result) {
- result = (result < 0) ? result : -EIO;
- goto out;
- }
- }
- out:
- return result;
-}
-
-/*
- * Exec Kernel system call: for obvious reasons only root may call it.
- *
- * This call breaks up into three pieces.
- * - A generic part which loads the new kernel from the current
- * address space, and very carefully places the data in the
- * allocated pages.
- *
- * - A generic part that interacts with the kernel and tells all of
- * the devices to shut down. Preventing on-going dmas, and placing
- * the devices in a consistent state so a later kernel can
- * reinitialize them.
- *
- * - A machine specific part that includes the syscall number
- * and the copies the image to it's final destination. And
- * jumps into the image at entry.
- *
- * kexec does not sync, or unmount filesystems so if you need
- * that to happen you need to do that yourself.
- */
-struct kimage *kexec_image = NULL;
-struct kimage *kexec_crash_image = NULL;
-
-asmlinkage long sys_kexec_load(unsigned long entry, unsigned long nr_segments,
- struct kexec_segment *segments, unsigned long flags)
-{
- struct kimage *image;
- int result;
-
- /* We only trust the superuser with rebooting the system. */
- if (!capable(CAP_SYS_BOOT))
- return -EPERM;
-
- if (nr_segments > KEXEC_SEGMENT_MAX)
- return -EINVAL;
-
- image = NULL;
- result = 0;
-
- if (nr_segments > 0) {
- unsigned long i;
- result = kimage_alloc(&image, nr_segments, segments);
- if (result) {
- goto out;
- }
- result = machine_kexec_prepare(image);
- if (result) {
- goto out;
- }
- image->start = entry;
- for (i = 0; i < nr_segments; i++) {
- result = kimage_load_segment(image, &image->segment[i]);
- if (result) {
- goto out;
- }
- }
- result = kimage_terminate(image);
- if (result) {
- goto out;
- }
- }
-
- if (!flags)
- image = xchg(&kexec_image, image);
- else
- image = xchg(&kexec_crash_image, image);
-
- out:
- kimage_free(image);
- return result;
-}
#include <linux/sysrq.h>
#include <linux/interrupt.h>
#include <linux/nmi.h>
+#ifdef CONFIG_KEXEC
#include <linux/kexec.h>
-#include <linux/crash_dump.h>
+#endif
int panic_timeout = 900;
int panic_on_oops = 1;
int tainted;
-unsigned int crashed;
-int crash_dump_on;
void (*dump_function_ptr)(const char *, const struct pt_regs *) = 0;
EXPORT_SYMBOL(panic_timeout);
BUG();
bust_spinlocks(0);
-	/* If we have crashed, perform a kexec reboot for dump write-out */
- crash_machine_kexec();
-
notifier_call_chain(&panic_notifier_list, 0, buf);
#ifdef CONFIG_SMP
#include <linux/init.h>
#include <linux/highuid.h>
#include <linux/fs.h>
-#include <linux/kernel.h>
-#include <linux/kexec.h>
#include <linux/workqueue.h>
#include <linux/device.h>
#include <linux/key.h>
cond_syscall(sys_lookup_dcookie)
cond_syscall(sys_swapon)
cond_syscall(sys_swapoff)
-cond_syscall(sys_kexec_load)
cond_syscall(sys_init_module)
cond_syscall(sys_delete_module)
cond_syscall(sys_socketpair)
#include <linux/config.h>
#include <linux/sched.h>
#include <linux/vs_context.h>
-#include <linux/fs.h>
#include <linux/proc_fs.h>
#include <linux/devpts_fs.h>
#include <linux/namei.h>
return ret;
}
-int vc_iattr_ioctl(struct dentry *de, unsigned int cmd, unsigned long arg)
-{
- void __user *data = (void __user *)arg;
- struct vcmd_ctx_iattr_v1 vc_data;
- int ret;
-
- /*
- * I don't think we need any dget/dput pairs in here as long as
- * this function is always called from sys_ioctl, i.e., de is
- * a field of a struct file that is guaranteed not to be freed.
- */
- if (cmd == FIOC_SETIATTR) {
- if (!capable(CAP_SYS_ADMIN) || !capable(CAP_LINUX_IMMUTABLE))
- return -EPERM;
- if (copy_from_user (&vc_data, data, sizeof(vc_data)))
- return -EFAULT;
- ret = __vc_set_iattr(de,
- &vc_data.xid, &vc_data.flags, &vc_data.mask);
- }
- else {
- if (!vx_check(0, VX_ADMIN))
- return -ENOSYS;
- ret = __vc_get_iattr(de->d_inode,
- &vc_data.xid, &vc_data.flags, &vc_data.mask);
- }
-
- if (!ret && copy_to_user (data, &vc_data, sizeof(vc_data)))
- ret = -EFAULT;
- return ret;
-}
-
#ifdef CONFIG_VSERVER_LEGACY
unsigned long min_low_pfn;
EXPORT_SYMBOL(min_low_pfn);
unsigned long max_pfn;
-/*
- * If we have booted due to a crash, max_pfn will be a very low value. We need
- * to know the amount of memory that the previous kernel used.
- */
-unsigned long saved_max_pfn;
EXPORT_SYMBOL(max_pfn); /* This is exported so
* dma_get_required_mask(), which uses
EXPORT_SYMBOL(totalram_pages);
EXPORT_SYMBOL(nr_swap_pages);
-#ifdef CONFIG_CRASH_DUMP
+#ifdef CONFIG_CRASH_DUMP_MODULE
/* This symbol has to be exported to use 'for_each_pgdat' macro by modules. */
EXPORT_SYMBOL(pgdat_list);
#endif
tainted |= TAINT_BAD_PAGE;
}
-#if !defined(CONFIG_HUGETLB_PAGE) && !defined(CONFIG_CRASH_DUMP)
+#if !defined(CONFIG_HUGETLB_PAGE) && !defined(CONFIG_CRASH_DUMP) \
+ && !defined(CONFIG_CRASH_DUMP_MODULE)
#define prep_compound_page(page, order) do { } while (0)
#define destroy_compound_page(page, order) do { } while (0)
#else
%define kversion 2.6.%{sublevel}
%define rpmversion 2.6.%{sublevel}
%define rhbsys %([ -r /etc/beehive-root ] && echo || echo .`whoami`)
-%define release 1.14_FC2.2.planetlab%{?date:.%{date}}
+%define release 1.14_FC2.1.planetlab%{?date:.%{date}}
%define signmodules 0
%define KVERREL %{PACKAGE_VERSION}-%{PACKAGE_RELEASE}